Chapter 1

Introduction 


Recent  years  have  seen  a  proliferation  in  the  use  of  multi-hop  wireless  networks,  in  diverse 
scenarios  ranging  from  community  mesh  networks,  to  wireless  sensor  networks.  As  these 
networks  are  deployed  and  used  at  increasingly  large  scales,  economic  viability  will  be  an 
important  concern.  Moreover,  in  many  cases,  the  form-factor  of  the  devices  may  be  dictated 
by  the  application  and  deployment  scenario. 

Given  the  cost  and  form-factor  considerations,  one  can  anticipate  that  individual  devices 
may  be  limited  in  their  functionality,  and/or  prone  to  various  forms  of  failure.  For  instance, 
even  though  a  large  number  of  frequencies  may  be  available  for  operation,  an  individual 
device’s transceiver may only be capable of tuning to a small number of frequencies. Hardware
failures may occur with non-negligible probability, making a device unusable. The
code  on  a  device  may  possibly  be  corrupted  or  compromised.  Despite  these  occurrences,  it 
is  desirable  that  the  network  as  a  whole  be  capable  of  tolerating  some  degree  of  functional 
constraints  and/or  failure  on  the  part  of  individual  nodes,  without  substantially  degrading 
overall  performance. 

While  sensor  networks  constitute  a  major  area  of  interest  for  such  constrained  devices, 
these  concerns  are  by  no  means  exclusively  limited  to  these  very  low-cost,  low  complexity 
devices.  One  can  envision  more  capable  and  complex  systems  being  subject  to  similar 
problems.  In  situations  where  a  large  number  of  spectrally-separated  frequency  bands  are 
available,  individual  devices  may  be  equipped  with  re-configurable  antennas  having  limited 
re-configurability. Sometimes policy issues may enforce constraints, e.g., in cognitive radio
networks,  presence  of  active  primary  users  in  some  frequencies  may  render  them  unusable 
by  secondary  users.  Software  bugs  in  distributed  application  code  may  lead  to  erroneous 
behavior.  Nodes  may  crash  and  be  rendered  nonoperational  for  varying  periods  of  time. 
Additionally, in certain scenarios, one may be willing to impose soft functional constraints
if this reduces protocol cost/complexity without significantly affecting performance.

The  goal  of  this  research  has  been  to  investigate  the  performance  of  wireless  networks 
that  are  subject  to  various  forms  of  functional  constraints  or  failures.  We  have  focused 
on  two  specific  problem  domains  that  are  very  relevant  to  emerging  scenarios  for  wireless 
network  deployment: 

Multi-channel wireless networks where nodes have radio-interfaces with heterogeneous
and constrained capabilities. Many existing wireless standards, e.g., IEEE
802.11,  IEEE  802.15.4,  provide  for  multiple  frequency  channels.  However,  most  radio 
transceivers  in  common  use  can  typically  only  be  active  on  any  one  of  the  available  channels 
at  a  time.  Moreover,  each  device  may  only  be  equipped  with  a  small  number  of  transceivers 
(often only one). In scenarios with multiple active users, harnessing these multiple channels
can lead to substantial performance improvement by increasing the number of feasible
concurrent  transmissions  in  the  network.  This  requires  appropriate  routing  and  scheduling 
strategies  to  distribute  the  traffic  load  across  interfaces  and  channels.  The  complexity  is 
further  increased  when  the  devices  may  be  of  varying  type,  cost  and  capability.  Thus,  they 
may have heterogeneous radio capabilities in terms of variable number of available interfaces.
Moreover, all interfaces may not be able to switch on all channels, and all channels
may  not  be  identical.  There  has  been  a  substantial  body  of  work  on  multi-channel  wireless 
networks  in  the  past  few  years.  However,  much  of  it  has  considered  nodes  with  identical 
radios,  with  very  limited  effort  in  the  direction  of  handling  interfaces  with  heterogeneous 
and constrained operational capabilities. With the availability of multiple unlicensed frequency
bands for use, it is increasingly relevant to envision devices equipped with radios
that  can  only  operate  on  some  part  of  the  total  available  spectrum.  In  order  to  allow 
a  diverse  set  of  devices  to  operate  as  part  of  a  single  network  while  still  obtaining  good 
performance,  sophisticated  algorithms  for  coordination,  as  well  as  traffic-load  distribution 
will  be  required.  Developing  insight  through  formal  theoretical  models  is  an  important 
precursor  in  that  direction.  As  part  of  this  dissertation  research,  we  have  examined  this 
issue.  We  have  developed  theoretical  models  and  formulated  results  for  the  same.  We  have 
also  designed  a  channel  and  interface  management  protocol  for  multi-channel  multi-radio 
wireless  networks,  which  draws  upon  some  of  the  insights  from  our  theoretical  work,  and 
serves as a proof-of-concept of the potential of developing a general design framework to
handle a wide range of heterogeneity in hardware characteristics and capabilities.

Single-channel wireless networks where nodes are prone to Byzantine or crash-stop
failures. Wireless networks are increasingly finding use in critical scenarios, e.g.,
industrial monitoring and actuation, first-responder networks, etc. In these scenarios, the
reliability of data communication is of prime importance, and may often be the most relevant
metric of interest. Due to fundamental differences in the nature of wired and wireless
communication, the design of reliable communication algorithms for wireless networks requires
a fresh approach. In particular, the wireless medium is a broadcast medium, i.e.,
a  transmission  can  be  received  by  many  receivers  in  the  vicinity  of  the  transmitter.  This 
characteristic  is  both  an  advantage  and  a  disadvantage  from  the  standpoint  of  reliability. 
The  broadcast  characteristic  can  be  exploited  to  improve  reliability  by  designing  algorithms 
that  harness  the  presence  of  multiple  witnesses  to  a  transmitted  message.  At  the  same  time, 
it  lays  transmissions  open  to  the  possibility  of  collisions  and  jamming.  An  influential  model 
for the study of fault-tolerant communication in the past two decades has been the Byzantine
fault model, and there is a large body of work that studies Byzantine fault-tolerant
communication  under  different  assumptions.  Given  the  distinctive  nature  of  the  wireless 
environment,  new  and  different  algorithms  are  needed  for  this  task.  This  has  led  to  recent 
interest in studying this problem in the context of networks with a local broadcast property.
As part of this dissertation research, we have examined the potential of exploiting
the availability of multiple witnesses to a message transmission in a wireless network, and
have  established  conditions  for  the  achievability  of  Byzantine  fault-tolerant  broadcast  in 
a  wireless  network  setting  under  certain  assumed  models.  These  results  provide  insight 
into  the  potential  for  leveraging  the  broadcast  nature  of  the  wireless  medium  to  improve 
reliability. 

1.1  Outline 

The  text  of  this  dissertation  can  be  broadly  categorized  into  two  parts,  each  pertaining  to 
one  of  the  two  problem  domains  discussed  above. 

Chapters 2-6 pertain to multi-channel wireless networks where devices may have heterogeneous
and constrained capabilities. In Chapter 2, we introduce the model for analyzing
performance in the presence of switching constraints, discuss related work, and present
some preliminaries. In Chapter 3 and Chapter 4, we present asymptotic connectivity and
transport capacity results for adjacent (c, f) assignment and random (c, f) assignment
respectively, and also discuss insights obtained from these results. We consider the scheduling
implications of heterogeneous channels and radios in networks of realistic scale in Chapter 5,
where we present some results on the performance of certain maximal schedulers in a
multi-channel wireless network. In Chapter 6, we describe the design and evaluation (via
simulation) of a channel and interface management protocol for a heterogeneous multi-channel
multi-radio network, which draws upon insights from the theoretical results in
previous  chapters,  as  well  as  existing  results  in  the  literature.  We  also  discuss  interesting 
directions  for  future  work. 

Chapters 7-10 pertain to reliable broadcast in failure-prone wireless networks. In Chapter 7,
we introduce the reliable broadcast problem and discuss related work. In Chapter 8,
we  present  results  for  a  locally  bounded  failure  model.  In  Chapter  9,  we  describe  results 
for  a  probabilistic  failure  model.  In  Chapter  10,  we  argue  for  the  need,  as  well  as  the 
potential,  to  evolve  lightweight  probabilistic  mechanisms  for  reliable  communication  that 
exploit  knowledge  of  physical  layer  characteristics  to  achieve  reliability,  and  sketch  out  a 
simple  algorithm  for  reliable  local  broadcast  as  a  proof-of-concept  of  the  same. 

We conclude in Chapter 11 by summarizing the contributions of the research performed
as  part  of  this  dissertation. 

General  notation  and  terminology  used  extensively  throughout  the  text  is  clarified  in 
Appendix  A.  Other  notation  and  terminology  is  introduced  prior  to  first  use.  Some  well- 
known  facts  and  results  that  have  been  used  in  some  of  the  proofs  are  compiled  together  in 
Appendix E.

Chapter 2

Interface  Heterogeneity  in  a 
Multi-Channel  Wireless  Network 


Many existing wireless standards provide for multiple frequency channels. For instance, the
widely used IEEE 802.11 standard for Wireless Local Area Networks specifies 11 channels
(of which 3 are non-overlapping) in the 2.4 GHz ISM band, and 12 channels in the 5 GHz
ISM band.¹ The IEEE 802.15.4 standard for Wireless Personal Area Networks also specifies
16 channels in the 2.4 GHz band.

¹ The number of available channels varies in different countries according to local regulations.

However, typical radio transceivers currently in common use can only be active on any
one of the available channels at a time. Moreover, each device may only be equipped with a
small number of transceivers. When there are multiple active users in the network, harnessing
these multiple channels can lead to substantial performance improvement by increasing
the number of feasible concurrent transmissions. This requires appropriate routing and
scheduling strategies to distribute the traffic load across interfaces and channels. The complexity
is further increased when the devices may be of varying type, cost and capability.
Thus, they may have heterogeneous radio capabilities in terms of variable number of available
interfaces. Moreover, all interfaces may not be able to switch on all channels, and
all channels may not be identical. There has been a substantial body of work on multi-channel
wireless networks in the past few years. However, much of it has considered nodes
with identical radios, with very limited effort in the direction of handling interfaces with
heterogeneous and constrained operational capabilities. Given the availability of multiple
frequency bands for unlicensed use, it is increasingly relevant to envision devices equipped
with radios that can each only operate on some part of the total available spectrum.

We briefly mention some scenarios of interest:

• The need for low-cost, low-power radio transceivers to be used in inexpensive sensor
nodes can give rise to many situations involving constrained switching. Hardware
complexity (and hence cost), and/or power consumption may be significantly reduced
if  each  node  operates  only  in  a  small  spectral  range,  and  switches  between  a  small 
subset  of  adjacent  channels  (e.g.,  if  the  transceiver  uses  an  oscillator  with  limited 
tunability).  However,  if  more  spectrum  is  available  than  a  single  device  can  utilize, 
it  may  be  possible  at  time  of  manufacture  to  lock  different  devices  on  to  different 
frequency  ranges.  Another  possible  scenario  is  one  in  which  a  node  may  be  equipped 
with  a  few  simple  radios  each  locked  to  a  single  frequency  at  time  of  manufacture  (a 
similar scenario is proposed in [93] in the context of untuned radios). Due to the small
form factor, at most one of these radios may be able to transmit at a time (receiving
simultaneously may or may not be possible). Thus, the net effect may be similar to
having  one  transceiver  that  can  switch  on  a  subset  of  frequencies,  but  only  be  active 
on  one  at  any  given  time. 

•  Another  recent  trend  is  towards  deployment  of  community  mesh  networks,  where 
participants  in  a  community  each  deploy  a  wireless  device  at  their  residence,  and  the 
resultant  network  can  be  used  to  extend  last  mile  Internet  connectivity,  as  well  as 
to  facilitate  peer-to-peer  communication  within  the  community.  Such  networks  are 
typically  not  likely  to  have  a  strongly  centralized  control,  and  there  exists  an  element 
of  organic  growth,  wherein  each  participant  may  choose  to  equip  their  device  with 
commodity  hardware  in  accordance  with  their  willingness  (subject  to  some  minimum 
capability  required  for  inter-operation).  For  instance,  in  a  network  where  all  devices 
are  equipped  with  802.11b  radios,  some  users  may  choose  to  equip  their  devices  with 
additional 802.11a or 802.11g radios, or may substitute their 802.11b radios with
802.11g radios (802.11g is backward-compatible with 802.11b).

While it may be possible to enforce the condition of uniformity on all devices in a network,
and thereby simplify the task of channel coordination, doing so forfeits the possibility
of  performance  gains  that  may  be  achieved  if  heterogeneous  capabilities  are  supported.  For 
instance,  in  the  sensor  network  scenarios  discussed  above,  one  could  manufacture  all  devices 
to  operate  on  the  same  small  subset  of  all  available  frequencies,  but  that  entails  leaving  the 
remaining  spectrum  unutilized.  Similarly,  in  the  mesh  network  scenario,  one  could  use 
protocols  that  only  support  802.11b,  but  that  would  imply  a  loss  of  the  opportunity  to 
exploit the additional spectrum (in case of 802.11a), or higher transmission rates (in case
of 802.11g).

Motivated by such concerns, we study the implications of heterogeneous interface capabilities
in a wireless network by studying the asymptotic capacity scaling behavior of a
network  of  devices  subject  to  constraints  on  the  channels  they  can  operate  on.  While  many 
of  the  above  discussed  scenarios  involve  both  heterogeneous  interfaces  and  heterogeneous 
channel  characteristics,  we  focus  our  effort  on  interfaces  with  limited  and  heterogeneous 
channel switching capability, and assume identical channels (we will consider the scheduling
implications of channel heterogeneity in Chapter 5).

In  this  chapter,  we  introduce  some  constraint  models,  describe  the  network  model  for 
our  asymptotic  capacity  results,  and  discuss  related  work.  We  also  state  and  prove  some 
results  pertaining  to  the  traffic  model. 

2.1  Some  Models  for  Channel  Switching  Constraints 

In  this  section,  we  describe  some  switching  constraint  models  that  we  have  formulated  and 
studied.  These  models  assume  that  each  node  possesses  only  one  half-duplex  interface, 
which can be active on only one channel at any given time. There are c channels available.
All channels are orthogonal, and of equal bandwidth. Each interface can only switch
(operate) on f channels out of c, and this set of f channels is dictated by the constraint
model. These models assume that $c \geq 2$. When $c = 1$, f can only take one value, viz., 1.
This reduces to the case of a single channel for which connectivity and capacity results are
already known [42, 43]. Therefore, $c \geq 2$ is the case of interest. Furthermore, the models
assume that $2 \leq f \leq c$. In Section 2.5, we explain why $c \geq 2$, $f = 1$ is disallowed.

2.1.1 Adjacent (c, f) Assignment

In this assignment model, an interface can switch between a set of f contiguous channels
where $2 \leq f \leq c$. We assume that the available spectrum is in the form of a single
contiguous frequency band, which is divided into c channels numbered $1, 2, \ldots, c$ in order
of increasing frequency. Prior to deployment, each interface is assigned a block location i
uniformly at random from $\{1, \ldots, c - f + 1\}$ and thereafter it can switch to any channel in
the set $\{i, \ldots, i + f - 1\}$. This model is relevant when each individual transceiver has limited
tunability, and thus may only switch between a small set of contiguous channels. It is also
possible to establish a mapping between this model, and the case of untuned radios [93].
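
The assignment rule described above can be illustrated with a minimal sketch. The function names `adjacent_assignment` and `share_channel` are hypothetical and introduced only for illustration; the sketch assumes channels numbered 1 to c and a block location drawn uniformly from the c - f + 1 possible starting positions.

```python
import random

def adjacent_assignment(c, f, rng=random):
    """Sample the channel set of one interface under adjacent (c, f) assignment.

    Hypothetical helper: draws a block location i uniformly from
    {1, ..., c - f + 1} and returns the contiguous set {i, ..., i + f - 1}.
    """
    if not (2 <= f <= c):
        raise ValueError("the model assumes 2 <= f <= c")
    i = rng.randint(1, c - f + 1)   # randint is inclusive on both ends
    return set(range(i, i + f))

def share_channel(channels_a, channels_b):
    """Two interfaces can communicate directly only if they share a channel."""
    return bool(channels_a & channels_b)
```

For example, with c = 10 and f = 3 an interface may receive {4, 5, 6}; two interfaces with block locations 1 and 8 would then share no channel and could only communicate through intermediate nodes.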

2.1.2 Random (c, f) Assignment

In this assignment model, an interface is assigned a subset of f channels ($2 \leq f \leq c$)
uniformly at random from the set of all possible channel subsets of size f. This model can
capture situations where tiny low-cost sensor nodes may be equipped with a transceiver
having a bank of f switchable filters (e.g., a design with a filter-bank has been proposed in
[86]). This model can also capture scenarios involving small form-factor nodes which are
equipped with a few simple radios, each locked to a single random frequency at manufacture
time. Due to the small form-factor (leading to a very small separation between the radios),
it would typically be infeasible for more than one radio to be active simultaneously. Thus, the
net  effect  would  be  as  if  each  node  is  equipped  with  a  single  radio  that  can  switch  over  a 
random  subset  of  channels. 
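
As a rough illustration of this model, the sketch below samples a random f-subset and computes the probability that two independently assigned interfaces share at least one channel. The names `random_assignment` and `prob_share_channel` are hypothetical; the closed form is a standard counting argument (the second interface must avoid all f channels of the first) and is not a result quoted from the text.

```python
import random
from math import comb

def random_assignment(c, f, rng=random):
    """Sample a channel set under random (c, f) assignment.

    Hypothetical helper: picks f of the channels 1..c uniformly at random
    from all subsets of size f.
    """
    if not (2 <= f <= c):
        raise ValueError("the model assumes 2 <= f <= c")
    return set(rng.sample(range(1, c + 1), f))

def prob_share_channel(c, f):
    """Probability that two independent f-subsets of c channels intersect.

    The subsets are disjoint only if the second one is drawn entirely from
    the c - f channels not used by the first, hence the complement below.
    math.comb returns 0 when f > c - f, giving probability 1 in that case.
    """
    return 1 - comb(c - f, f) / comb(c, f)
```

With c = 4 and f = 2 the probability of sharing a channel is 5/6, whereas with c = 3 and f = 2 any two interfaces necessarily overlap.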

2.2  Asymptotic  Capacity  Analysis 

In  their  seminal  paper  [43],  Gupta  and  Kumar  introduced  the  approach  of  asymptotic 
capacity  analysis  to  understand  the  scaling  behavior  of  a  wireless  network,  as  the  network 
size increases towards infinity. They defined a quantity, the transport capacity, as a measure
of the network’s ability to transfer data.

Two network models were considered in [43]: arbitrary networks and random networks.
Of these, we discuss random networks, as this is the model we utilize for our results. In the
random  network  case,  n  nodes  are  located  uniformly  at  random  in  the  network  region. 
Each  node  is  the  source  of  exactly  one  flow.  It  chooses  its  destination  by  choosing  a  point 
uniformly  at  random  and  selecting  the  node  closest  to  it  other  than  itself.  Given  this  traffic 
model,  the  average  distance  traversed  by  a  flow  is  of  the  same  order  as  the  network  diameter. 
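
The traffic model can be sketched as follows. This is only an illustration: it assumes the unit-area torus adopted later in Section 2.3 as the network region, and the helper names `torus_distance` and `choose_destinations` are hypothetical.

```python
import math
import random

def torus_distance(p, q):
    """Euclidean distance on the unit torus (coordinates wrap around at 1)."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, 1.0 - dx), min(dy, 1.0 - dy))

def choose_destinations(nodes, rng):
    """For each source, pick a uniform random point and return the index of
    the node (other than the source itself) lying closest to that point."""
    dests = []
    for s in range(len(nodes)):
        point = (rng.random(), rng.random())
        candidates = (j for j in range(len(nodes)) if j != s)
        dests.append(min(candidates, key=lambda j: torus_distance(nodes[j], point)))
    return dests

rng = random.Random(0)
n = 200
nodes = [(rng.random(), rng.random()) for _ in range(n)]
dests = choose_destinations(nodes, rng)
# Mean source-destination distance; it stays a constant fraction of the
# region's diameter as n grows, since the random point is independent of n.
mean_len = sum(torus_distance(nodes[s], nodes[d]) for s, d in enumerate(dests)) / n
```

Because the target point is drawn independently of the source location, the typical flow length does not shrink with n, which is the property the last sentence of the paragraph above relies on.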

In a random network, the per-flow network capacity is said to be $\Theta(\lambda(n))$ if there exist
constants $c_1, c_2$ such that:

$$\lim_{n \to \infty} \Pr[\text{throughput } c_1 \lambda(n) \text{ is achievable for each flow}] = 1 \qquad (2.1)$$

$$\lim_{n \to \infty} \Pr[\text{throughput } c_2 \lambda(n) \text{ is achievable for each flow}] < 1 \qquad (2.2)$$


Two models for interference were defined in [43], viz., the Protocol Model and the Physical
Model. Of these, the Protocol Model for a random network is defined as follows:

All nodes in the network use a common transmission range r(n). A transmission from
a node A to a node B is successful if and only if the distance $AB \leq r(n)$ and for any other
concurrently active transmitter C, the distance $BC \geq (1 + \Delta) r(n)$, where $\Delta$ is a constant
which embodies a guard-zone needed to prevent interference.
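
A minimal sketch of the Protocol Model check follows. It assumes the two conditions in the form used in [43], namely that the receiver is within range r of its transmitter and that every other concurrent transmitter is at distance at least (1 + Δ) r from the receiver. The function name `transmission_succeeds` is hypothetical, and planar Euclidean distance is used for simplicity.

```python
import math

def dist(p, q):
    """Planar Euclidean distance between two points given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def transmission_succeeds(a, b, other_transmitters, r, delta):
    """Protocol Model test for a transmission from a to b.

    a, b: positions of transmitter and receiver.
    other_transmitters: positions of all other concurrently active transmitters.
    r: common transmission range; delta: guard-zone constant.
    """
    if dist(a, b) > r:
        return False  # receiver is out of range
    guard = (1 + delta) * r
    # Every other active transmitter must lie outside the receiver's guard zone.
    return all(dist(c, b) >= guard for c in other_transmitters)
```

For instance, with r = 0.15 and Δ = 1, a receiver at distance 0.1 from its transmitter tolerates an interferer 0.4 away but not one 0.2 away, since the guard zone has radius 0.3.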

2.3  Assumed  Network  Model 

We  assume  a  random  network  with  the  Protocol  model  of  interference.  We  now  describe 
the  details  of  the  model. 

n nodes are located uniformly at random in a unit area torus.² All nodes use a common
transmission range r(n), which can be appropriately selected.³

There are c available channels of bandwidth $W/c$ each, where W is the total available
bandwidth. We focus on the case where
the total number of available channels $c = O(\log n)$. This is reasonable because, in large
scale deployments, the number of nodes will typically be much larger than the number of
available channels. Besides, when $c = \omega(\log n)$, there is a substantial capacity degradation
even with unconstrained channel switching (as shown in [65]), thus making channelization
an increasing liability. Constrained switching can only lead to additional degradation, and
potentially unacceptable performance.

We assume the same traffic model as in [43]:

Each node is the source of exactly one flow. It chooses a point uniformly at random (we
shall henceforth refer to these points as pseudo-destinations), and selects the node (other
than itself) lying closest to that point as its destination.

² Since the Protocol Model does not involve an explicit power constraint, the unit area assumption in the
Protocol Model can be viewed as simply a normalization of a general area A. Capacity results (in bits/sec)
for the unit-area continue to hold for a torus of general area A. Results in bit-meters/sec can be obtained
by simply multiplying the unit area results with $\sqrt{A}$. Results regarding critical range for connectivity also
simply require a scaling by a factor of $\sqrt{A}$. We also remark that from a physical standpoint, the relevant
interpretation is indeed that involving an extended network region (whose area increases as n increases),
else as argued in [28], scenarios with ever-increasing network density cease to be physically relevant.

³ Although we denote it by r(n), the transmission range can potentially be a function not only of n, but
also c and f.

2.4 Related Work


Connectivity and Capacity of Wireless Networks. There is a substantial body of
prior work on deriving the conditions under which a given network is connected, and conditions
for connectivity have been formulated in the context of many different network models.
For a unit area network with uniformly distributed node placement, where nodes have a
common transmission radius r(n), it was shown in [42] that if $\pi r^2(n) = \frac{\log n + b(n)}{n}$, the
network is asymptotically connected with probability 1 iff $b(n) \to \infty$. An alternate model
was considered in [123], where nodes deployed uniformly at random may individually modulate
their transmission power (and hence range) to ensure that they have a certain number
of neighbors. It was proved that each node must be connected to $\Theta(\log n)$ neighbors for
asymptotic connectivity with probability 1. The issue of theta-coverage and connectivity
was considered in [124]. Another relevant body of work is that on bond percolation in
wireless networks, e.g., [34].
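
To give a feel for the connectivity condition of [42], the sketch below solves $\pi r^2(n) = (\log n + b(n))/n$ for the range r(n). The function name `critical_range` and the choice of b are illustrative assumptions; the result is asymptotic, so the numbers only indicate how slowly the required range shrinks with n.

```python
import math

def critical_range(n, b):
    """Transmission range r solving pi * r**2 = (log n + b) / n.

    n: number of nodes in the unit area; b: the value of b(n), which must
    grow without bound as n increases for asymptotic connectivity.
    """
    return math.sqrt((math.log(n) + b) / (math.pi * n))

# Example with the slowly growing choice b(n) = log log n (an assumption).
r_1000 = critical_range(1000, math.log(math.log(1000)))
```

The range decays roughly as $\sqrt{\log n / n}$, so the expected number of neighbors within range grows on the order of log n.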

In  [43],  Gupta  and  Kumar  defined  the  notion  of  asymptotic  transport  capacity  of  a 
wireless  network,  and  obtained  results  for  the  capacity  of  arbitrary  and  random  networks 
in a single-channel single-interface scenario for two models of interference, viz., the Protocol
Model  and  the  Physical  Model. 

For the Protocol Model, they established that in an arbitrary network, the capacity
scales as $\Theta\left(\frac{W}{\sqrt{n}}\right)$ bit-m/s per flow, while in a random network, it scales as
$\Theta\left(\frac{W}{\sqrt{n \log n}}\right)$ bits/s.

For the Physical Model, they showed that capacity for random networks is $O\left(\frac{W}{\sqrt{n}}\right)$ and
$\Omega\left(\frac{W}{\sqrt{n \log n}}\right)$. It was later shown by Franceschetti et al. in [35] that under the Physical Model,
a per-flow throughput of $\Omega\left(\frac{1}{\sqrt{n}}\right)$ can be achieved in a random network. While this may seem
to close the gap in the result of [43], this is not strictly the case, as the model of [35] allows
use of different data-rates over different links, but stipulates a common transmission power,
whereas in [43], different transmission powers may be used, but all communication requires
the same SINR threshold, implying that it occurs at a single common rate (corresponding
to  a  case  where  only  one  particular  modulation  scheme  may  be  available).  However,  a 
variation  of  their  construction  proves  the  result  for  the  model  of  [43],  and  this  is  described 
in  [125].  Improved  capacity  bounds  for  the  Protocol  Model  were  presented  in  [1].  This  work 
also  generalized  the  notion  of  exclusion-regions  to  arbitrary  shapes  that  could  potentially 
be used to model interference when using directional antennas.

It was shown in [39] that mobility can increase the capacity of a wireless network, and
in fact $\Theta(1)$ throughput per flow is attainable when each node is source and destination for
exactly one flow each. The capacity of hybrid networks (those having some infrastructure
support in the form of access points) was studied in [76] and [59].

The throughput-delay trade-off was studied in [36], and it was shown that the optimal
trade-off is given by $D(n) = \Theta(n T(n))$ where $D(n)$ is delay, and $T(n)$ is throughput. The
capacity of ultra-wideband (UWB) networks was studied in [95] and [128].

It  was  shown  in  [28]  that  under  the  unit  area  assumption,  the  Physical  Model  breaks 
down when n becomes very large, yielding a singularity, and for a model involving a non-singular
attenuation function, the per-flow capacity would be asymptotically limited to
$O\left(\frac{1}{n}\right)$. Franceschetti et al. [80] have recently shown that fundamental laws of physics dictate
a limit of $O\left(\frac{\log^2 n}{\sqrt{n}}\right)$ for per-flow capacity scaling when n nodes are distributed over an area of
order n.

In  [77],  it  is  shown  that  for  the  network/traffic  model  of  [43],  and  the  Protocol  Model  of 
interference,  the  use  of  network  coding  only  yields  a  constant  factor  benefit  (this  constant 
factor  is  a  function  of  the  guard  zone  parameter  A  in  the  Protocol  Model). 

A  concise  presentation  of  many  capacity  results  is  available  in  [125]. 

Multi-channel Networks. It was also shown in [43] that if the available bandwidth
W is split into c channels, with each node having a dedicated interface per channel, the
results remain the same as for a single-channel, single-interface scenario. However, an
interesting, and fairly common, scenario arises when the number of interfaces m at each
node may be smaller than the number of available channels c. This issue was analyzed in [65]
and it was shown that the capacity results are a function of the channel-to-interface ratio
$\frac{c}{m}$. It was also shown that in the random network case, there are three distinct capacity
regions: when $\frac{c}{m} = O(\log n)$, the per-flow capacity is $\Theta\left(\frac{W}{\sqrt{n \log n}}\right)$; when $\frac{c}{m} = \Omega(\log n)$ and also
$O\left(n \left(\frac{\log \log n}{\log n}\right)^2\right)$, the per-flow capacity is $\Theta\left(W \sqrt{\frac{m}{nc}}\right)$; and when
$\frac{c}{m} = \Omega\left(n \left(\frac{\log \log n}{\log n}\right)^2\right)$, the per-flow capacity is $\Theta\left(\frac{W m \log \log n}{c \log n}\right)$.
The issue of interface switching delay was also
briefly considered in [65], and it was shown that access to some extra interfaces can allow
one to completely mask the switching delay.
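
The three regions reported in [65] can be explored numerically with the rough sketch below. It treats the asymptotic thresholds $\log n$ and $n (\log \log n / \log n)^2$ on the ratio c/m as if they were exact boundaries and drops all constants, so the returned values are order-of-growth expressions, not throughput predictions. The function names `capacity_regime` and `per_flow_order` are hypothetical.

```python
import math

def capacity_regime(n, c, m):
    """Return 1, 2 or 3 according to where c/m falls relative to the two
    asymptotic thresholds (constants ignored; meaningful only for large n)."""
    ratio = c / m
    log_n = math.log(n)
    upper = n * (math.log(log_n) / log_n) ** 2
    if ratio <= log_n:
        return 1
    if ratio <= upper:
        return 2
    return 3

def per_flow_order(n, c, m, W=1.0):
    """Order-of-growth expression for per-flow capacity in each regime."""
    log_n = math.log(n)
    regime = capacity_regime(n, c, m)
    if regime == 1:
        return W / math.sqrt(n * log_n)          # same as the single-channel case
    if regime == 2:
        return W * math.sqrt(m / (n * c))        # limited by interface scarcity
    return W * m * math.log(log_n) / (c * log_n)  # limited by destination bottleneck
```

For n = 10^6 and a single interface per node (m = 1), c = 2 falls in the first region, c = 100 in the second, and c = 10^5 in the third, with the expression decreasing from one region to the next.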

In  [63],  an  additional  multi-channel  scenario  is  considered  where  each  node  has  two 
interfaces that may each be assigned a channel based on traffic patterns, but must thereafter
remain fixed on those. For a permutation routing model, it was shown that the capacity
with  two  fixed  interfaces  is  of  the  same  asymptotic  order  as  that  with  one  fully-switchable 
interface. 

Constraints on Channel Availability and Tuning. Situations in which some channels
may  be  unavailable  to  some  nodes  have  been  considered  in  some  work  on  cognitive  radio. 
An  area-blocking  model  (with  a  notion  of  a  protected  radius  around  a  primary  user)  is 
considered in [98], which is similar to the spatially correlated channel assignment model we
briefly discuss in Chapter 2. However, the goal of that work is not to determine multi-hop
capacity.  In  [61],  a  model  is  considered  where  channel-sets  of  neighboring  nodes  may  differ 
by  at  most  k  channels.  Some  algorithms  for  node-discovery  in  such  networks  are  proposed. 
None  of  these  works  has  focused  on  obtaining  a  formal  model  of  such  anticipated  spatially 
correlated  constraints  for  connectivity  and  capacity  analysis. 

It  was  proposed  in  [93]  that  extremely  inexpensive  wireless  devices  can  be  manufactured 
if  it  is  possible  to  handle  untuned  radios  whose  operating  frequency  may  lie  randomly  within 
some band. An additional possibility considered was that each device may have a small
number  of  such  untuned  radios.  The  model  of  [93]  involved  a  source  and  destination  capable 
of  transmitting/receiving  on  all  frequencies  concurrently,  that  are  spatially-separated,  and 
must  communicate  via  a  back-plane  of  devices  with  untuned  radios.  A  random  network 
coding  based  approach  was  proposed  to  relay  information  between  the  source-destination 
pair,  and  it  was  shown  that  0(c)  throughput  is  achievable,  where  c  is  the  maximum  number 
of  disjoint  channels  possible. 

On  a  related  note,  constraints  that  are  somewhat  similar  in  spirit  are  also  encountered 
in  optical  networks  with  wavelength-division  multiplexing  (WDM).  In  an  optical  network, 
all  nodes  may  not  be  capable  of  wavelength  conversion  (see,  e.g.,  [108,  73]).  Architectures 
have  been  proposed  for  sparse  wavelength  conversion  [108],  such  that  only  a  small  fraction 
of  nodes  have  wavelength  conversion  capability.  Architectures  where  nodes  have  limited 
conversion  capability  have  also  been  proposed  [71]. 

Systems/Architectures with Limited Channel-Switching A multi-channel multi-hop
network architecture has been considered in [99] in which each node has a single
transceiver, and nodes have a quiescent channel to which they tune when not transmitting.
A node wishing to communicate with a destination tunes to the destination's quiescent channel,
and  transmits  the  packet  to  a  neighbor  whose  quiescent  channel  is  the  same  as  that  of 
the  destination.  Thereafter,  the  packet  proceeds  towards  the  destination  on  the  quiescent 
channel.  This  has  some  similarity  to  the  model  and  constructions  in  Section  3.5  and  Section 
4.6.  However,  in  their  case,  channel-transitions  can  happen  trivially  at  the  very  first  hop, 
since  the  source  node  is  always  capable  of  tuning  to  the  destination’s  quiescent  channel.  In 
contrast,  in  our  models,  interfaces  can  only  switch  on  some  channels,  and  this  needs  to  be 
taken  into  account  when  routing  packets. 

2.5  Constraints  that  Limit  Capacity 

In this section, we briefly discuss some general constraints on the capacity of the network
(for any channel assignment model). Recall that c is the total number of available channels,
each channel has bandwidth $W/c$, and f is the number of channels any single interface can
operate on. Furthermore, each node is equipped with a single interface.

Source-Destination Constraint for f = 1

If f = 1, but c > 1, then communication between a source and its destination is possible if
and only if they are both capable of operating on the same channel. This may not always
happen if the channels are assigned in some random manner.

To illustrate, consider the class of switching constraint models where the operational
channel-sets assigned to individual nodes are i.i.d. Suppose the probability that $i$ and $dst(i)$
operate on a common channel is at most $p$. If the traffic model is such that any single node
can be the destination of only up to $D(n)$ flows, then we argue thus:

We can obtain at least $\lfloor \frac{n}{2D(n)+1} \rfloor$ source-destination pairs such that no node appears in
more than one pair (leading to independent probabilities). The probability that, in at least
one of the $n$ source-destination pairs, the source and destination do not operate on the
same channel can be lower bounded by the probability that the source and destination in
at least one of these distinct pairs do not operate on a common channel. This probability
is at least $1 - p^{\lfloor n/(2D(n)+1) \rfloor} = 1 - \frac{1}{(1/p)^{\lfloor n/(2D(n)+1) \rfloor}}$. When $\lfloor \frac{n}{2D(n)+1} \rfloor \log\frac{1}{p} \to \infty$, this probability
converges to 1 as $n \to \infty$. Hence, the network would have zero capacity. For the adjacent
(c, f) and random (c, f) assignments studied in this dissertation, with $c \ge 2$, $c = O(\log n)$,
this condition indeed holds. Therefore f = 1 when $c \ge 2$ yields zero capacity, and
our model definitions (Section 2.1) disallow this possibility.

When f > 1, as in the rest of the discussion on asymptotic transport capacity in this
dissertation, this constraint does not apply.
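
The bound in the argument above is easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: the function name and the sample values of n, D(n) and p are illustrative assumptions. It computes the lower bound $1 - p^{\lfloor n/(2D(n)+1) \rfloor}$ on the probability that some source-destination pair shares no channel.

```python
def zero_capacity_lower_bound(n: int, d_n: int, p: float) -> float:
    """Lower bound on Pr[at least one source-destination pair shares no channel].

    n   : number of flows (one per node)
    d_n : maximum number of flows destined to any single node, D(n)
    p   : upper bound on Pr[a source and its destination share a channel]
    """
    # Number of node-disjoint source-destination pairs guaranteed by the argument.
    disjoint_pairs = n // (2 * d_n + 1)
    # All disjoint pairs share a channel with probability at most p ** disjoint_pairs.
    return 1.0 - p ** disjoint_pairs


# Illustrative values only: f = 1 with c = 2 equally likely channels gives p = 1/2.
bound = zero_capacity_lower_bound(n=10_000, d_n=10, p=0.5)
```

With these hypothetical values there are 476 disjoint pairs, so the bound is numerically indistinguishable from 1, in line with the zero-capacity conclusion.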


Connectivity  Constraint 

This constraint was first formulated in [43]. Given that each node is a source in the assumed
traffic model, if even a single node is isolated (i.e., partitioned from the rest of the network),
this would imply that the capacity is trivially zero. Thus, at the very least one requires that
no node be isolated. Suppose the necessary condition to avoid isolated nodes is that $r(n) = \Omega(g(n))$.
It follows from the interference model that each transmission occupies $\Omega(r^2(n))$
area; this limits the spatial re-use in the network to $O(\frac{1}{r^2(n)})$ concurrent transmissions on
any single channel. Besides, each source-destination pair is separated by average $\Theta(1)$ distance
(see [43] for details) and hence average $\Theta(\frac{1}{r(n)})$ hops. This limits the per-flow throughput
to $O(\frac{W}{n r(n)})$.


Interference  Constraint 


It was established in [65] that the per flow capacity is constrained to $O(W\sqrt{\frac{1}{cn}})$ when
each node possesses a single interface that is capable of switching to any channel. Since it
is always possible to simulate a switching constraint model in a network where interfaces
can switch to any channel, any throughput achievable with switching constraints is also
achievable in the unconstrained switching case. Therefore, the upper bound of $O(W\sqrt{\frac{1}{cn}})$
also applies to adjacent (c, f)-assignment, and random (c, f)-assignment.


Destination  Bottleneck  Constraint 

This constraint was first articulated in [65]. If the traffic model is such that some node
can be the destination of up to $D(n)$ flows, the per-flow throughput is constrained to be
$O(\frac{W}{cD(n)})$, since the destination must time-share its interface between these $D(n)$ flows.

In the region c = O(log n), the connectivity constraint turns out to be asymptotically
dominant.

2.6 Some Results about the Traffic Model


As  stated  in  Section  2.3,  we  assume  the  traffic  model  of  [43].  We  now  establish  some  general 
results  pertaining  to  this  traffic  model. 


Lemma  1.  The  number  of  flows  for  which  any  node  is  the  destination  is  O(logn)  w.h.p. 

Proof.  Consider  a  flow’s  pseudo-destination  Dh  Consider  a  circle  of  radius  and 

hence  area  centered  around  this  pseudo-destination.  Applying  Lemma  60  to  the  set 

of  n  nodes,  each  such  circle  contains  0(logn)  nodes,  w.h.p.  In  a  rare  scenario,  one  of  these 
nodes  could  potentially  be  the  source  node  for  that  flow.  However,  the  circle  still  has  more 
than  one  node  other  than  the  flow’s  source.  Thus,  the  flow  will  select  some  node  within  this 
circle  as  its  destination.  Hence,  a  flow  will  only  be  assigned  a  destination  within  distance 
^  from  its  pseudo-destination.  Therefore,  a  node  can  only  be  the  destination  for 
flows  whose  pseudo-destination  lies  within  a  distance  from  it.  Applying  Lemma 

60 to the set of n pseudo-destinations, each circle of this size contains O(log n)
pseudo-destinations w.h.p. Thus the number of flows for which any node is the destination is
O(log n) w.h.p. □
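
The behaviour described by Lemma 1 can be observed empirically. The sketch below is a small simulation under our reading of the traffic model of [43]: nodes and pseudo-destinations are uniform on the unit torus, and each flow's destination is the node nearest to its pseudo-destination, excluding the flow's own source. The function names, the brute-force nearest-node search and the choice of n are illustrative assumptions, not part of the original text.

```python
import random
from collections import Counter


def torus_dist2(a, b):
    """Squared distance between two points on the unit torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    dx = min(dx, 1.0 - dx)  # wrap around in x
    dy = min(dy, 1.0 - dy)  # wrap around in y
    return dx * dx + dy * dy


def destination_loads(n, seed=0):
    """Return a Counter mapping node index -> number of flows destined to it."""
    rng = random.Random(seed)
    nodes = [(rng.random(), rng.random()) for _ in range(n)]
    loads = Counter()
    for src in range(n):
        # Each source picks a pseudo-destination uniformly at random.
        pseudo = (rng.random(), rng.random())
        # The destination is the nearest node other than the source itself.
        dst = min((j for j in range(n) if j != src),
                  key=lambda j: torus_dist2(nodes[j], pseudo))
        loads[dst] += 1
    return loads


loads = destination_loads(200, seed=1)
max_load = max(loads.values())  # compare against log(n) for growing n
```

Running this for increasing n and comparing `max_load` with log n gives an informal check of the O(log n) behaviour; it is not a substitute for the proof.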


Lemma 2. For large n, at least one node is the destination for $\Omega(\log n)$ flows with a
probability at least $\frac{1}{e}(1 - \frac{1}{e})(1 - \delta)$, where $\delta > 0$ is an arbitrarily small constant.

Proof. The necessary condition for connectivity in [42] (Theorem 2.1 of [42]) was established
by proving that if we consider $R(n)$ such that $\pi R^2(n) = \frac{\log n + b(n)}{n}$, where $\limsup b(n) = b < \infty$,
then with positive probability, there exists at least one node x which is isolated, i.e.,
there is no other node within distance $R(n)$ of x. In the context of [42], this was utilized by
interpreting $R(n)$ as transmission range, and thus obtaining a lower bound for connectivity.
However, we now exploit that result in a different manner to prove our lemma as follows:
Choose $R(n) = \sqrt{\frac{\log n + 1}{\pi n}}$, i.e., $b(n) = b = 1$. Note that in this proof, $R(n)$ is not the
transmission range; it is merely a chosen distance value. Invoking Theorem 2.1 from [42],
with probability p there exists a node x such that there is no other node within a distance
$R(n)$ from it, where $\liminf p \ge e^{-1}(1 - e^{-1}) = \frac{1}{e}(1 - \frac{1}{e})$. It follows (see Theorem 2.1 in
[42]) that $p \ge (1 - \epsilon)\frac{1}{e}(1 - \frac{1}{e})$, for any $\epsilon > 0$, and sufficiently large n. Call this event $\mathcal{E}_1$.

Conditioned on the occurrence of event $\mathcal{E}_1$, and therefore the existence of such a node
x, let us consider the Voronoi tessellation generated by the n nodes. Evidently, the area of
the Voronoi polygon of x is at least $\frac{\pi R^2(n)}{4} = \frac{\log n + 1}{4n}$ (every point within distance $R(n)/2$ of x
is closer to x than to any other node). Note that this tessellation
constitutes a spatial partition of the network area. From the definition of the traffic model,
it follows that if a flow's pseudo-destination falls within the polygon of node x, then x is
selected as that flow's destination, unless x is itself the source of that flow (since a generator
(node) is always the nearest node to points within its own Voronoi polygon). Recall that
pseudo-destination locations are chosen uniformly at random over the unit torus. Let
$V_i$, $1 \le i \le n$, be indicator variables such that $V_i = 1$ if x is flow i's destination, and 0 else.
Then $\Pr[V_i = 1 \mid \mathcal{E}_1] = 0$ if x is the source of flow i (and there is exactly one such i).

For all other values of i, x would be selected as flow i's destination if either (1) flow i's
pseudo-destination falls in x's Voronoi polygon (the probability of this event is given by the
area of x's Voronoi polygon, and is thus at least $\frac{\log n + 1}{4n}$), or (2) flow i's pseudo-destination
falls within the polygon of its own source, and x is the next-nearest node (we can ignore
this latter possibility, as we only require a lower bound, and we therefore pretend that x is
chosen as the destination of flow i if and only if flow i's pseudo-destination falls within x's
Voronoi polygon).

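In light of the above, it can be seen that for all i such that x is not the source of flow i:
$\Pr[V_i = 1 \mid \mathcal{E}_1] \ge \frac{\log n + 1}{4n}$. Let $V = \sum_{i:\, x \text{ not source of } i} V_i$. Thus $E[V \mid \mathcal{E}_1] \ge (1 - \frac{1}{n})\frac{\log n + 1}{4} \ge \frac{\log n}{4}$
for large n. Furthermore, the $V_i$'s are independent. Therefore, application of the Chernoff
bound from Lemma 53 (with $\beta = \frac{1}{2}$) yields that:

\[ \Pr[V \le \tfrac{\log n}{8} \mid \mathcal{E}_1] \le \Pr[V \le \tfrac{E[V \mid \mathcal{E}_1]}{2} \mid \mathcal{E}_1] \le \exp(-\tfrac{E[V \mid \mathcal{E}_1]}{8}) \le \exp(-\tfrac{\log n}{32}) = \frac{1}{n^{1/32}} \qquad (2.3) \]

Denote by $\mathcal{E}_2$ the event that some node indeed is destination for at least $\frac{\log n}{8}$ flows. Using
(2.3), we obtain that $\Pr[\mathcal{E}_2 \mid \mathcal{E}_1] \ge 1 - \frac{1}{n^{1/32}}$. Also, $\Pr[\mathcal{E}_2] \ge \Pr[\mathcal{E}_1] \Pr[\mathcal{E}_2 \mid \mathcal{E}_1]$. Hence at least
one node is a destination for $\Omega(\log n)$ flows with a probability at least $(1 - \epsilon)e^{-1}(1 - e^{-1})(1 - \frac{1}{n^{1/32}}) \ge \frac{1}{e}(1 - \frac{1}{e})(1 - \delta)$
for any chosen $\delta > \epsilon$, and sufficiently large n. □

2.7 A Remark on the Proof Technique

We make a remark on the proof techniques that are used in Chapter 3 and Chapter 4. It is
to be noted that many of the intermediate lemmas in the proofs are conditioned on certain
desirable events proved to occur w.h.p. in prior lemmas. Let a generic undesirable event
be denoted by $\mathcal{E}_i$ (i.e., $\neg\mathcal{E}_i$ is the desirable event). Note that the following is always true:

\[ \Pr[\mathcal{E}_1 \cup \mathcal{E}_2] = \Pr[\mathcal{E}_1] + \Pr[\neg\mathcal{E}_1] \Pr[\mathcal{E}_2 \mid \neg\mathcal{E}_1] \le \Pr[\mathcal{E}_1] + \Pr[\mathcal{E}_2 \mid \neg\mathcal{E}_1] \qquad (2.4) \]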

In light of this, it is not hard to see that the probability that even one of the undesirable
events from any of these lemmas occurs can be upper-bounded by summing up the
individual (in some cases, conditional) probabilities of occurrence of each undesirable event,
as bounded by each lemma (i.e., by essentially applying a union bound on the possibly
conditional probabilities). Since a proof comprises a small constant number of lemmas,
and each lemma proves that the (possibly conditional on previous lemmas) probability of
occurrence of some undesirable event goes to 0 (or equivalently shows that the probability
of occurrence of the complementary desirable event goes to 1), this sum will also go to zero.
Hence, the probability that even one of the undesirable events happens goes to 0. Where
not explicitly stated, this union-bound argument is implicitly applied.

Chapter 3

Adjacent (c, f) Assignment


In this chapter, we present capacity results for the adjacent (c, f) assignment model that
was introduced in Chapter 2. We begin by defining the adjacent (c, f) assignment model
in Section 3.1, and summarize the chapter results in Section 3.2. We present necessary
and sufficient conditions for connectivity in Section 3.3. Section 3.4 presents an upper
bound on capacity. A capacity-achieving lower bound construction is described in Section
3.5. In Section 3.6, we show how our results for adjacent (c, f) assignment can be used to
obtain results for the case of untuned radios. We conclude with a brief discussion on the
implications of the capacity result in Section 3.7.


3.1  Model  Definition 


In the adjacent (c, f) model, the frequency band is divided into c channels numbered 1, 2,
..., c in order of increasing frequency, but an individual interface can only use f channels
($2 \le f \le c$). Prior to deployment, each interface is assigned a block location i uniformly
at random from $\{1, ..., c - f + 1\}$ and thereafter it can switch between the channels in the set $\{i, ..., i + f - 1\}$.

Thus, the probability that an interface is assigned block location i (where $1 \le i \le c - f + 1$)
is $\frac{1}{c-f+1}$.

Since channel i occurs in $\min\{i, c - i + 1, f, c - f + 1\}$ blocks, and each block has a
probability $\frac{1}{c-f+1}$ of being assigned:

\[ \Pr[\text{a given interface can switch on channel } i] = p_s^{adj}(i) = \frac{\min\{i, c - i + 1, f, c - f + 1\}}{c - f + 1} \qquad (3.1) \]
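
The block assignment and the count behind (3.1) can be checked by direct enumeration. The sketch below is illustrative only; the function names are ours. It draws a block location uniformly at random and compares the enumerated probability that an interface can switch on a channel with the closed form in (3.1), using exact rational arithmetic.

```python
import random
from fractions import Fraction


def sample_block(c, f, rng):
    """Draw a block location i uniformly from 1..c-f+1 and return its channel set."""
    i = rng.randint(1, c - f + 1)  # randint is inclusive on both ends
    return set(range(i, i + f))    # channels i, ..., i+f-1


def p_switch_enumerated(c, f, ch):
    """Pr[an interface can switch on channel ch], by counting blocks that contain ch."""
    n_blocks = c - f + 1
    hits = sum(1 for i in range(1, n_blocks + 1) if i <= ch <= i + f - 1)
    return Fraction(hits, n_blocks)


def p_switch_formula(c, f, ch):
    """Closed form from equation (3.1)."""
    return Fraction(min(ch, c - ch + 1, f, c - f + 1), c - f + 1)
```

For example, with c = 5 and f = 2 the fringe channel 5 lies in a single block, so both functions return 1/4, whereas channel 3 lies in two blocks and yields 1/2.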

Since we consider only single-interface nodes for the results in this chapter, there is
a one-to-one mapping between interfaces and nodes. Thus, we often use the term node
instead of interface in the following discussion.

The probability that a node with block location i can operate on a common channel (we
often refer to this as sharing a channel) with another randomly chosen node is given by:

\[ p_{adj}(i) = \frac{1 + \min\{i - 1, f - 1\} + \min\{c - f + 1 - i, f - 1\}}{c - f + 1} \qquad (3.2) \]

It is evident that:

\[ \min\{\frac{f}{c-f+1}, 1\} \le p_{adj}(i) \le \min\{\frac{2f-1}{c-f+1}, 1\} \qquad (3.3) \]
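
The expression for $p_{adj}(i)$ and the bounds in (3.3) can likewise be verified by enumerating block pairs. This is a minimal sketch with illustrative function names; it counts the block locations j whose channel set intersects that of block i.

```python
from fractions import Fraction


def p_adj_enumerated(c, f, i):
    """Pr[a node with block i shares a channel with a uniformly chosen block], by enumeration."""
    n_blocks = c - f + 1
    mine = set(range(i, i + f))
    shared = sum(1 for j in range(1, n_blocks + 1)
                 if mine & set(range(j, j + f)))  # non-empty intersection
    return Fraction(shared, n_blocks)


def p_adj_formula(c, f, i):
    """Closed form: the block itself, plus overlapping blocks below and above it."""
    n_blocks = c - f + 1
    return Fraction(1 + min(i - 1, f - 1) + min(n_blocks - i, f - 1), n_blocks)


def within_bounds(c, f, i):
    """Check the lower and upper bounds of (3.3) for block location i."""
    n_blocks = c - f + 1
    lower = min(Fraction(f, n_blocks), 1)
    upper = min(Fraction(2 * f - 1, n_blocks), 1)
    return lower <= p_adj_formula(c, f, i) <= upper
```

The lower bound is attained at the fringes (i = 1 or i = c - f + 1), which is the observation that motivates the aliasing assumption in the proof of Theorem 1.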

3.2  Summary  of  Results 

We  prove  the  following  results: 

1. We show that in the regime c = O(log n), the critical transmission range for connectivity
with adjacent (c, f) assignment is $\Theta(\sqrt{\frac{c \log n}{fn}})$.

2. We establish the per-flow capacity under adjacent (c, f) assignment for the regime
c = O(log n) as $\Theta(W\sqrt{\frac{f}{cn \log n}})$.

A  preliminary  version  of  the  chapter  results  was  reported  in  [7]. 

3.3  Conditions  for  Connectivity 

3.3.1  Necessary  Condition  for  Connectivity 

We obtain a necessary condition for connectivity through an adaptation of the proof techniques
used to obtain the necessary condition for connectivity in [42].

Theorem 1. With an adjacent (c, f) channel assignment (when c = O(log n)), if $p = \min\{\frac{2f-1}{c-f+1}, 1\}$
and $\pi r^2(n) = \frac{\log n + b(n)}{pn}$, where $b = \lim_{n \to \infty} b(n) < +\infty$, then:

\[ \liminf_{n \to \infty} \Pr[\text{disconnection}] \ge e^{-b}(1 - e^{-b}) \]

where by disconnection we imply the event that there is a partition of the network.

Proof. We present a proof-sketch here. The detailed proof is described in Appendix B.

Given that a node has block location i, the probability that it can operate on a common
channel with another random node within its range is given in (3.3), and denoted by $p_{adj}(i)$.

Note that $p_{adj}(i)$ is different for different block locations i primarily because nodes with
blocks at the fringes of the band are less likely to share channels with other nodes. Since
we are deriving a necessary condition for connectivity, it is valid to make the following
assumption for the purpose of this proof:

Channel pairs $(i, c - f + i + 1)$, $1 \le i \le f - 1$, possess magical capabilities, such that
communication on channel i ends up being visible on channel $c - f + i + 1$, and vice-versa.
Thus, if a node has channel i, then it can also communicate with a node that does
not share any channel with it, but has channel $c - f + i + 1$. Another way to view this
situation is that although nodes are assigned channels as per the adjacent (c, f) model,
channel $c - f + i + 1$, $1 \le i \le f - 1$, is actually an alias for i. Thus, at the time of network operation,
a node having channel $c - f + i + 1$, $1 \le i \le f - 1$, uses channel i instead (i.e., $c - f + i + 1$
serves as an alias for i).

Under this assumption, $\forall i: p_{adj}(i) = \min\{\frac{2f-1}{c-f+1}, 1\}$. If the network is disconnected
under this assumption, then it must necessarily be so otherwise. This can be argued as
follows: suppose we are given a network instance with nodes assigned adjacent channels as
per the adjacent (c, f) model, and we then impose the assumption stated above. Suppose
this new network is disconnected. Now the imposed assumption is removed, but the channel
block assigned to each node remains unchanged. Then, in the new scenario, some nodes that
were earlier able to communicate will not be able to do so anymore; however, those nodes
that were incapable of communicating will preserve their status quo. Hence, a necessary
condition for connectivity in the hypothetical network would remain valid even in the actual network.

Therefore, to establish a necessary condition for connectivity with adjacent (c, f) assignment,
we establish a necessary condition for connectivity in a scenario where we have
the additional assumption described above. This proof is an adaptation of a similar proof
in [42] (Theorem 2.1 in [42]).
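
The effect of the aliasing assumption can be checked numerically. In the sketch below (illustrative names; not part of the original proof) we model the alias pairs by reducing channel numbers modulo c - f + 1, which is our reading of the rule that channel c - f + i + 1 is identified with channel i. Under this identification every block location has the same sharing probability, $\min\{\frac{2f-1}{c-f+1}, 1\}$.

```python
from fractions import Fraction


def aliased_block(c, f, i):
    """Channel set of block i after identifying channel (c-f+1)+k with channel k."""
    n_blocks = c - f + 1
    # Assumption: the alias pairs are applied transitively, i.e. modulo n_blocks.
    return {(ch - 1) % n_blocks + 1 for ch in range(i, i + f)}


def p_adj_with_aliases(c, f, i):
    """Sharing probability for block i under the aliasing assumption, by enumeration."""
    n_blocks = c - f + 1
    mine = aliased_block(c, f, i)
    shared = sum(1 for j in range(1, n_blocks + 1)
                 if mine & aliased_block(c, f, j))
    return Fraction(shared, n_blocks)


def p_uniform(c, f):
    """The uniform value min{(2f-1)/(c-f+1), 1} used in Theorem 1."""
    return min(Fraction(2 * f - 1, c - f + 1), 1)
```

For instance, with c = 6 and f = 2 the fringe block {1, 2} now shares a channel with blocks 1, 2 and 5 (since channel 6 aliases channel 1), giving 3/5 for every block location.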

We focus on the disconnection event where singleton sets are partitioned from the rest
of the network. Recall that $p = \min\{\frac{2f-1}{c-f+1}, 1\}$. When $f \ge \frac{c+2}{3}$, $p = 1$, i.e., any pair of nodes
that are within range can communicate with each other as they can operate on at least one
common channel, and the necessary condition result from [42] applies directly. Thus, we
need to consider only $f < \frac{c+2}{3}$, for which $p = \frac{2f-1}{c-f+1}$.

The probability that a node x is isolated, i.e., cannot communicate with any node, is
given by $p_1 = (1 - p\pi r^2(n))^{n-1}$. We can also obtain an upper bound $p_2$ on the probability
that two nodes x and y are both isolated.

When $\limsup_{n \to \infty} b(n) < +\infty$, it can be shown that:

\[ \Pr[\text{disconnection}] \ge \sum_{x} \Pr[x \text{ is the only isolated node}] \ge \sum_{x} \Pr[x \text{ isolated}] - \sum_{x \ne y} \Pr[x \text{ and } y \text{ both isolated}] \ge \theta e^{-b} - (1 + \epsilon)e^{-2b} \qquad (3.4) \]

for any $\theta < 1$, $\epsilon > 0$, and large n.

Therefore, if $\limsup_{n \to \infty} b(n) < +\infty$, the network is asymptotically disconnected with
some positive probability. The detailed proof is described in Appendix B. □


Corollary 1. With an adjacent (c, f) assignment, the critical transmission range for connectivity
in the regime c = O(log n) is $\Omega(\sqrt{\frac{c \log n}{fn}})$.

Proof. Whenever $f \ge \frac{c+2}{3}$, $p = 1$ in Theorem 1, and whenever $f < \frac{c+2}{3}$, $p = \frac{2f-1}{c-f+1}$.
In both cases $p = \Theta(\frac{f}{c})$, and the necessary condition requires that $\pi r^2(n) > \frac{\log n}{pn}$.
Hence with adjacent (c, f) assignment, connectivity
requires that $r(n) = \Omega(\sqrt{\frac{c \log n}{fn}})$. □


3.3.2  Sufficient  Condition  for  Connectivity 

It can be shown that having $r(n) = a_1\sqrt{\frac{c \log n}{fn}}$, for some suitable constant $a_1$, suffices to
ensure that the network is asymptotically connected w.h.p. This will be evident from our
lower bound construction for capacity. Therefore, the proof is not presented separately.


3.4  Upper  Bound  on  Capacity 

We proved in Theorem 1 that to avoid isolated nodes $r(n)$ must be $\Omega(\sqrt{\frac{c \log n}{fn}})$. Then by
the connectivity constraint mentioned in Section 2.5 of Chapter 2, the per flow throughput
is limited to $O(W\sqrt{\frac{f}{cn \log n}})$.

3.5 Lower Bound on Capacity

We present a constructive proof that achieves a per-flow throughput of $\Omega(W\sqrt{\frac{f}{cn \log n}})$. This
construction  has  similarity  to  the  constructions  in  [43,  36,  65],  but  has  certain  distinctive 
features  that  stem  from  the  need  to  address  the  channel  switching  constraints. 

The surface of the unit torus is divided into square cells of area $a(n)$ each. The transmission
range $r(n)$ is set to $\sqrt{8a(n)}$, thereby ensuring that any node in a given cell is within
range of any other node in any adjoining cell. Since we utilize the Protocol Model [43], a
node C can potentially interfere with an ongoing transmission from node A to node B only
if $BC \le (1 + \Delta)r(n)$. Thus, a transmission by A in a given cell can only be affected by
transmissions in cells with some point within a distance $(2 + \Delta)r(n)$ from it, and all such
cells must lie within a circle of radius $O((1 + \Delta)r(n))$. Since $\Delta$ is independent of n, the
number of cells that interfere with a given cell is only some constant (say $\gamma$).
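
The two geometric facts used above can be checked with a few lines of code. The sketch below is illustrative: the function names are ours, distances are measured in units of the cell side, torus wrap-around is ignored (cells are assumed small relative to the torus), and a cell is counted as interfering when its closest point lies within $(2 + \Delta)r(n)$ of the given cell.

```python
import math


def max_dist_between_cells(side, dx, dy):
    """Largest distance between a point of cell (0, 0) and a point of cell (dx, dy)."""
    return math.hypot((abs(dx) + 1) * side, (abs(dy) + 1) * side)


def min_dist_between_cells(side, dx, dy):
    """Smallest distance between a point of cell (0, 0) and a point of cell (dx, dy)."""
    return math.hypot(max(abs(dx) - 1, 0) * side, max(abs(dy) - 1, 0) * side)


def interfering_cell_count(delta):
    """Number of other cells with some point within (2 + delta) * r of a given cell.

    With r = sqrt(8 a) and cell side sqrt(a), the reach is (2 + delta) * sqrt(8)
    cell sides, so the count depends on delta only and not on a(n) or n.
    """
    reach = (2 + delta) * math.sqrt(8)  # in units of the cell side
    k = math.ceil(reach) + 1            # search window that safely covers the reach
    return sum(1
               for dx in range(-k, k + 1)
               for dy in range(-k, k + 1)
               if (dx, dy) != (0, 0)
               and min_dist_between_cells(1.0, dx, dy) <= reach)
```

The first function confirms that the farthest pair of points in two adjoining cells (diagonal neighbours) is exactly $\sqrt{8a(n)}$ apart; the last one makes explicit why the number of interfering cells is a constant once $\Delta$ is fixed.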

We  choose  a{n)  =  (and  hence  r(n)  = 

The  following  result  follows  from  an  application  of  Lemma  59: 

Lemma  3.  The  number  of  nodes  in  any  cell  lies  between  and  yjith  proba¬ 

bility  at  least  1  — 

Definition  1.  (Preferred  Channels)  Channels  i  for  which  p1^\i)  >  ^  are  deemed  preferred 
channels. 

For  any  set  of  /  contiguous  channels,  at  least  [|]  of  the  channels  have  {i)  > 
Hence,  each  node  can  switch  on  x  >  [|]  >  ^  preferred  channels.  Also  note  that  non¬ 
preferred  channels  only  occur  at  the  fringes  of  the  frequency  band. 

Lemma  4.  If  there  are  at  least  nodes  in  every  cell  TL,  then  at  least  12  log  n  nodes 

in  each  cell  are  capable  of  switching  on  each  of  the  preferred  channels,  with  probability  at 
least  1  —  qi,  where  qi  =  O(^). 

Proof.  Let  us  consider  one  particular  cell  7i,  with  xn  >  nodes.  Let  Xij  =  1  if  node 

j  can  switch  on  preferred  channel  i,  and  0  else.  Pi[Xij  =  1]  =  ps'^^{i)  >  since  i  is  a 

preferred channel. For a given i, all the $X_{ij}$'s are independent. Let $X_i = \sum_{j \in \mathcal{H}} X_{ij}$. Then:

\[ E[X_i] = p_s^{adj}(i)\, x_{\mathcal{H}} \ge 25 \log n \]

Applying the Chernoff bound in Lemma 53 (setting $\beta = \frac{1}{2}$), we obtain:

\[ \Pr[X_i \le 12 \log n] \le \Pr[X_i \le \tfrac{25}{2} \log n] \le \exp(-\tfrac{25 \log n}{8}) < \frac{1}{n^3} \qquad (3.5) \]

The number of preferred channels is at most c = O(log n). Application of the union bound
over all such channels yields:

\[ \Pr[X_i \le 12 \log n \text{ for any preferred } i] \le \frac{c}{n^3} = O(\frac{\log n}{n^3}) \]

Since there are at most n cells, another application of the union bound yields:

\[ \Pr[X_i \le 12 \log n \text{ in any cell}] = O(\frac{\log n}{n^2}) \qquad (3.6) \]

□

□ 

Each  cell  indeed  has  at  least  nodes  w.h.p.  (Lemma  3).  Thus,  a  union  bound 

argument  (as  was  explained  in  Section  2.7)  can  be  invoked  to  show  that  each  cell  has  at 
least  12  log  n  nodes  on  every  preferred  channel  w.h.p. 

Lemma  5.  If  there  are  at  least  nodes  in  every  cellTt,  then,  for  all  adjacent  preferred 

channels  i  and  i  +  1,  there  are  at  least  12 logn  nodes  in  each  cell  capable  of  switching  on 
both  channels  i  and  i  +  1,  with  probability  at  least  1  —  q2,  where  q2  =  O(^). 

Proof.  Let  us  consider  one  particular  cell  7i  with  xn  nodes,  where  x-n  > 

Xij  =  1  if  node  j  can  switch  on  both  channel  i  and  i  +  1  (where  both  i  and  i  +  1  are 

preferred),  and  0  else.  For  a  given  i,  all  the  Aj^’s  are  independent. 

Then  Pr[Xij  =  1]  >  ^i|L  >  Let  A*  =  X)  A^.  Then  E[Xi]  >  251ogn.  By 

j&H 

application of the Chernoff bound from Lemma 53 (with $\beta = \frac{1}{2}$), we obtain:

\[ \Pr[X_i \le 12 \log n] \le \Pr[X_i \le \tfrac{25}{2} \log n] \le \exp(-\tfrac{25 \log n}{8}) < \frac{1}{n^3} \qquad (3.7) \]

i cannot take more than c - 1 distinct values, and c - 1 = O(log n). By taking a union
bound over all such possibilities, we obtain that $\Pr[X_i \le 12 \log n \text{ for any preferred } i, i+1] \le \frac{c-1}{n^3} = O(\frac{\log n}{n^3})$.
Since there are at most n cells, another application of the union
bound yields:

\[ \Pr[X_i \le 12 \log n \text{ in any cell}] = O(\frac{\log n}{n^2}) \qquad (3.8) \]

□

□ 

From  Lemma  3,  each  cell  has  at  least  nodes  w.h.p.  Thus,  each  cell  has  at  least 

12  log  n  nodes  on  every  pair  of  adjacent  preferred  channels  {i,i  +  1)  w.h.p. 

Lemma  6.  If  there  are  at  least  nodes  in  every  cell,  and  if  i  and  i  +  x  are  both 

preferred  channels,  where  x  <  then  there  are  at  least  12  log  n  nodes  in  the  cell  capable 
of  switching  on  both  channels  i  and  i  +  x  with  probability  at  least  1  —  qs,  where  qs  =  O(^). 

Proof.  Note  that  since  i  is  preferred,  it  follows  that  i  >  [^] .  A  node  can  switch  on  both  i 
and  i  +  X  if  its  block  location  lies  between  max{l,i  +  x  —  /  +  1}  and  i.  This  probability 
is  .  Since  x  <  [|j,  this  probability  is  at  least  ^-J+i  —  Thereafter  the  proof 

argument  is  the  same  as  that  of  Lemma  5.  □ 
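
The counting step in the proof of Lemma 6 can be checked by enumeration. The sketch below uses illustrative names of our own. It lists the block locations whose channel set contains both i and i + x, and compares their number with the length of the interval from max{1, i + x - f + 1} to i, where we additionally cap the upper end at c - f + 1 (the largest valid block location).

```python
def blocks_with_both(c, f, i, x):
    """Block locations b in 1..c-f+1 whose channels b..b+f-1 include both i and i+x."""
    n_blocks = c - f + 1
    return [b for b in range(1, n_blocks + 1)
            if b <= i and i + x <= b + f - 1]


def blocks_with_both_count(c, f, i, x):
    """Closed-form count: block locations from max(1, i+x-f+1) to min(i, c-f+1)."""
    low = max(1, i + x - f + 1)
    high = min(i, c - f + 1)
    return max(0, high - low + 1)
```

Dividing this count by c - f + 1 gives the probability that a single node can switch on both channels; it is zero whenever x > f - 1, since no block of f contiguous channels can then contain both.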

3.5.1  Routing 

We denote the source of a flow by S, the pseudo-destination by D', and the actual destination
by D. We begin by briefly summarizing the routing strategy used in [43]. In [43],
one  node  in  each  cell  was  designated  the  relay  for  all  routes  traversing  that  cell  but  not 
originating/terminating  in  it;  a  flow’s  route  traversed  the  cells  intersected  by  the  straight 
line  SD'  (i.e.,  they  were  relayed  through  the  assigned  relay  nodes  in  the  sequence  of  cells 
intersected  by  the  straight-line  SD')  and  thereafter  needed  to  take  at  most  one  extra-hop 
to  reach  the  actual  destination  D,  which  necessarily  lay  either  in  the  same  cell  as  D'  or  in 
one  of  the  8  adjacent  cells. 

Lemma 7. The number of straight-line SD'D routes that traverse any cell is $O(n\sqrt{a(n)})$.

Proof. From Lemma 61 (Appendix E) we know that the number of SD' straight-lines
traversing a single cell is $O(n\sqrt{a(n)})$. We must now consider the number of routes whose
last D'D hop may enter this cell. If D is in the same cell as D', there is no extra hop.
Otherwise, the number of flows for which D' lies in one of the 8 adjacent cells is $O(na(n))$
w.h.p. (since applying Lemma 59 to the set of n pseudo-destinations yields that the number
of pseudo-destinations in any cell is $O(na(n))$ w.h.p.). Since $na(n) = O(n\sqrt{a(n)})$, the total
number of traversing routes is $O(n\sqrt{a(n)})$. □

Hereafter,  we  shall  refer  to  this  routing  strategy  as  straight-line  routing,  since  it  basically 
comprises  a  straight-line  except  for  the  last  hop. 

If  there  were  no  constraints  on  channel  switching,  one  could  envision  determining  the 
cells  that  a  route  should  traverse  using  a  routing  strategy  similar  to  that  in  [43].  We  do 
remark  that,  even  in  the  absence  of  switching  constraints,  in  a  multi-channel  network  with 
c  channels  where  each  node  has  fewer  than  c  interfaces,  it  does  not  suffice  to  designate  a 
single  relay  node  in  each  cell,  as  multiple  nodes  must  be  concurrently  active  within  a  cell 
to  harness  the  available  bandwidth  (see  [65]). 

In the presence of the switching constraints imposed by the adjacent (c, f) assignment, a
feasible route must comprise more than just a sequence of nodes from source to destination
such that consecutive nodes are within range of each other. Rather, a feasible route must
comprise a sequence of nodes $v_0 = S, v_1, ..., v_k, v_{k+1} = D$ such that for all $0 \le i \le k$: (1) $v_i$
and $v_{i+1}$ are within range of each other, and (2) they can operate on some common channel.
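
The two feasibility conditions translate directly into a check on a candidate node sequence. The following is a minimal sketch; the data layout (dictionaries of positions and channel sets) and the function names are assumptions made for illustration, and distances are measured on the unit torus.

```python
import math


def torus_distance(a, b):
    """Euclidean distance between two points on the unit torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return math.hypot(min(dx, 1.0 - dx), min(dy, 1.0 - dy))


def is_feasible_route(route, positions, channels, r):
    """Check conditions (1) and (2) for every pair of consecutive nodes.

    route     : list of node ids, from the source S to the destination D
    positions : dict node id -> (x, y) location on the unit torus
    channels  : dict node id -> set of channels the node's interface can switch on
    r         : transmission range r(n)
    """
    for u, v in zip(route, route[1:]):
        if torus_distance(positions[u], positions[v]) > r:
            return False  # condition (1) fails: not within range
        if not (channels[u] & channels[v]):
            return False  # condition (2) fails: no common channel
    return True


# Hypothetical three-node example with f = 2 contiguous channels per node.
positions = {"S": (0.1, 0.1), "R": (0.2, 0.1), "D": (0.3, 0.1)}
channels = {"S": {1, 2}, "R": {2, 3}, "D": {3, 4}}
```

In this example the route S, R, D is feasible with r = 0.15, whereas S, D is not, both because D is out of range of S and because S and D share no channel.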

To be able to find such a feasible route w.h.p., the route of a flow may need to traverse
a certain minimum number of intermediate nodes (i.e., a feasible sequence of nodes leading
from $v_0 = S$ to $v_{k+1} = D$ must have a certain minimum number of nodes). We elaborate
further:

We begin by observing that the source must transmit on one of the channels that its
interface can switch on. Similarly, the destination must receive on one of the channels
that its interface can switch on. Suppose the source uses channel $l$ to transmit, and the
destination chooses to use channel $r$ to receive.

We assume w.l.o.g. that $l \le r$. Suppose $r - l = k'\lfloor \frac{f}{2} \rfloor + m$ ($0 \le m < \lfloor \frac{f}{2} \rfloor$). Thus
$k' = \lfloor \frac{r - l}{\lfloor f/2 \rfloor} \rfloor$. From the model, and the definition of a preferred

channel,  it  follows  that,  given  two  preferred  channels  I  and  r  all  channels  I  <  i  <  r  must 
also  necessarily  be  preferred.  In  light  of  this,  using  the  result  proved  in  Lemma  6,  one  can 
see  that  it  is  always  possible  to  transition  from  /  to  r  in  at  most  k'  +  1  <  y  +  1  steps: 
$l \to l + \lfloor \frac{f}{2} \rfloor \to \cdots \to l + k'\lfloor \frac{f}{2} \rfloor \to l + k'\lfloor \frac{f}{2} \rfloor + m = r$.

More specifically, we can find a sequence of nodes $v_0 = S, v_1, v_2, ..., v_{k'}, v_{k'+1} = D$ such
that $v_0$ and $v_1$ both can operate on channel $l$, $v_1$ and $v_2$ can both operate on channel $l + \lfloor \frac{f}{2} \rfloor$,
and so on. It is also evident that a sequence of nodes that allows a transition from channel
I  to  channel  r  must  comprise  at  least  nodes. 

More generally, we can try to find a feasible route which comprises a sequence of nodes
$v_0 = S, v_1, \ldots, v_i, \ldots, v_{i+k'+1}, \ldots, v_k = D$, such that (1) $v_0, \ldots, v_i$
can all operate on channel $l$, (2) for all $i \le m < i + k'$: $v_m$ and $v_{m+1}$ can both
operate on channel $l + (m - i + 1)\lfloor f/2 \rfloor$, (3) $v_{i+k'}$ and $v_{i+k'+1}$ can both
operate on channel $r$, and (4) $v_{i+k'+2}, \ldots, v_k$ can all operate on $r$. The
subsequence $v_i, \ldots, v_{i+k'+1}$ comprises the transition sequence in this route. Links on
this route that lie before the transition sequence use the source channel $l$ to transmit the
flow’s packets, while links that lie after the transition sequence use the destination
channel $r$. Links $(v_{i+x-1}, v_{i+x})$, $1 \le x \le k'$, in the transition sequence use
channel $l + x\lfloor f/2 \rfloor$, and link $(v_{i+k'}, v_{i+k'+1})$ uses channel $r$.

From the above it is evident that a feasible route must comprise a certain minimum
number of intermediate relay nodes, i.e., must traverse a certain minimum number of
hops.

We now address the issue of how the channels $l$ and $r$ are chosen by $S$ and $D$ respectively.

Channel Selection and Transition. Initially, after each source has chosen a random
destination, each flow is assigned an initial source channel, as well as a target destination
channel, in the following manner:

The source $S$ of a flow has an assigned contiguous channel-set (say $\{i, \ldots, i + f - 1\}$),
while the destination $D$ also has an assigned contiguous channel-set (say $\{j, \ldots, j + f - 1\}$).
One of the $x \ge \frac{f}{2}$ preferred channels available at the source is selected uniformly
at random as the source channel. One of the $y \ge \frac{f}{2}$ preferred channels available at
the destination is selected as the channel on which the flow reaches the destination. The
choice of destination channel can be made using any arbitrary criterion from amongst all
preferred channels that the destination can operate on.
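As a rough illustration of this selection rule, the following Python sketch picks the two channels for one flow. The names are hypothetical, the set of preferred channels is passed in as a parameter rather than derived from the model, and taking the lowest-numbered preferred channel at the destination is just one example of an "arbitrary criterion".

```python
import random


def select_flow_channels(src_start, dst_start, f, preferred, rng=random):
    """Pick (source_channel, destination_channel) for one flow.

    src_start / dst_start: first channel of each node's contiguous block of f channels.
    preferred: set of channel indices regarded as preferred (assumed given).
    """
    src_pref = [ch for ch in range(src_start, src_start + f) if ch in preferred]
    dst_pref = [ch for ch in range(dst_start, dst_start + f) if ch in preferred]
    if not src_pref or not dst_pref:
        raise ValueError("each block must contain at least one preferred channel")
    source_channel = rng.choice(src_pref)  # uniform over the source's preferred channels
    dest_channel = min(dst_pref)           # arbitrary criterion (an assumption of this sketch)
    return source_channel, dest_channel
```

Only the source channel needs to be random; the later load-balancing argument relies on the source's choice being uniform over its preferred channels.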

¹ When $l > r$, the transitions are of the form $l \rightarrow l - \ldots$

To ensure that each route has enough hops to assure a feasible transition sequence, we
stipulate that the straight-line cell-to-cell path be followed if either the chosen source and
destination channels are the same, or if the straight-line segment $SD'$ comprises $L \ge y$
intermediate hops. If $S$ and $D'$ (hence also $D$) lie close to each other, the hop-length of
the straight-line cell-to-cell path can be much smaller. In this case, a longer detour path
is chosen. Consider a circle of radius $y\,r(n)$ centered at $S$. Choose a point on this circle,
say $P$. In the considered $c = O(\log n)$ regime, $P$ can be any point on the circle. The route
is obtained by traversing cells along $SP$ and then $PD'D$. This ensures that the route has
at least the minimum required hop-length (provided by segment $SP$). This situation is
illustrated in Fig. 3.1. Flows that follow such a detour route shall hereafter be referred to
as detour-routed flows, whereas the remaining flows (which follow a straight-line route) will
be referred to as non-detour-routed flows.

Figure 3.1: Illustration of detour routing
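The choice between a straight-line and a detour route can be sketched as follows. This is a simplified illustration, not the construction itself: the function names are hypothetical, points are plain $(x, y)$ tuples, wrap-around at the region boundary is ignored, and the hop count of a segment is crudely approximated as its length divided by an assumed per-hop distance `hop_length`.

```python
import math
import random


def detour_waypoint(source, radius, rng=random):
    """Pick a point P uniformly on the circle of the given radius around the source."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (source[0] + radius * math.cos(theta),
            source[1] + radius * math.sin(theta))


def plan_route_waypoints(source, pseudo_dest, min_hops, hop_length,
                         same_channel, rng=random):
    """Return [S, D'] for a straight-line route or [S, P, D'] for a detour route."""
    # Crude hop estimate (assumption): segment length divided by per-hop distance.
    straight_hops = math.dist(source, pseudo_dest) / hop_length
    if same_channel or straight_hops >= min_hops:
        return [source, pseudo_dest]
    # Segment SP alone supplies the minimum required number of hops.
    p = detour_waypoint(source, min_hops * hop_length, rng)
    return [source, p, pseudo_dest]
```

Here `min_hops` stands in for the hop requirement of the transition sequence and `min_hops * hop_length` for the circle radius used in the text.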

The route of a flow comprises two phases: a progress-on-source-channel phase, and
a transition phase. Intuitively, while in the progress-on-source-channel phase, the flow’s
packets are transmitted at each hop on the chosen source channel $l$. Once in the transition
phase, the packets get transmitted along a sequence of channels that constitute a transition
from $l$ to $r$, as was described earlier. Once the transition sequence has reached channel $r$,
the packets are transmitted along any remaining hops on $r$, till they are received at the
destination.

The initial hops of the route of a non-detour-routed flow constitute the progress-on-source-channel
phase. The flow remains in this phase till there are only $\lceil y \rceil$ intermediate
hops left to the destination. At this point, it enters transition mode. A detour-routed flow
is always in transition mode.
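The two phases can be made concrete with a small per-link channel plan. The sketch below is an assumption-laden illustration: `link_channel_plan` and its parameter `transition_links` (the number of trailing links reserved for the transition) are hypothetical names, and the downward chain for $l > r$ is assumed to mirror the upward one.

```python
def link_channel_plan(num_links, l, r, f, transition_links, detour=False):
    """Return the channel used on each link of a route, in order.

    A non-detour flow stays on source channel l until only `transition_links`
    links remain; a detour-routed flow is in transition mode from the first link.
    """
    if f < 2:
        raise ValueError("the construction assumes f >= 2")
    step = f // 2
    direction = 1 if r >= l else -1
    k_prime, m = divmod(abs(r - l), step)
    chain = [l + direction * x * step for x in range(k_prime + 1)]
    if m:
        chain.append(r)
    start = 0 if detour else max(0, num_links - transition_links)
    if num_links - start < len(chain):
        raise ValueError("not enough links left to complete the transition")
    plan = [l] * start  # progress-on-source-channel phase
    for idx in range(num_links - start):
        # Transition phase: walk the chain, then stay on r for any remaining links.
        plan.append(chain[min(idx, len(chain) - 1)])
    return plan
```

For example, a route of 8 links from channel 3 to channel 10 with $f = 4$ and 6 transition links stays on channel 3 for the first two links, then steps through 3, 5, 7, 9, 10 and finishes on 10.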

Lemma 8. Suppose the event addressed in Lemma 6 holds. Suppose a flow’s selected
preferred source channel is $l$ and its selected preferred destination channel is $r$. Then, after
having traversed $h \ge \lceil y \rceil + 1$ cells (recall that $2 \le f \le c$), it is guaranteed to
have made the transition.

Proof. The event considered in Lemma 6 is that each cell has at least $12 \log n$ nodes on
each pair of preferred channels $(i, x)$ that are at most $\lfloor f/2 \rfloor$ apart. Given that
the chosen source channel is $l$, the flow packets are transmitted on $l$ on those hops where
the flow is in progress-on-source-channel mode. When the flow moves into transition mode, the
first relay node in this phase chooses as next-hop a node having channel pair
$(l, l + \lfloor f/2 \rfloor)$ in the next cell (the exact method for choosing relay nodes is
described later), and transmits the flow’s packets to it using channel $l$. This node then
chooses a next hop having channel pair $(l + \lfloor f/2 \rfloor, l + 2\lfloor f/2 \rfloor)$, and
sends packets to it over channel $l + \lfloor f/2 \rfloor$, and the process continues
till the flow has found a transition into the chosen destination channel $r$. This requires at
most $\lceil y \rceil$ intermediate hops, which are obtained by traversing at most
$\lceil y \rceil + 1$ cells. Once the transition to destination channel $r$ is done, flow packets
are transmitted on channel $r$ for the remaining hops (if any) to the destination. □

The  event  considered  in  Lemma  6  holds  w.h.p.,  and  therefore,  each  flow  will  be  able  to 
find  such  a  transition  sequence  w.h.p. 

Lemma 9. If the number of distinct flows traversing any cell is $x$ in case of pure
straight-line routing, it is at most $x + O\left(n \frac{c^2}{f^2} r^2(n)\right) = x + O(\log^4 n)$
even with detour routing.²

Proof. Since a detour route lies within a circle of radius $y\,r(n)$ around the source, the extra
detour-routed flows that may possibly pass through a cell (compared to the case where
only straight-line routing is performed) are those whose sources lie within a distance $y\,r(n)$
from this cell. All such possible sources fall within a circle of radius $(y + 1)r(n)$, and hence
area $a_c(n) = \pi (y + 1)^2 r(n)^2$. Any circle of this radius has $O(n a_c(n))$ nodes, and hence
at most $O(n a_c(n))$ sources w.h.p. (Lemma 60). Therefore, the number of detour-routed
flows that traverse the cell is $O(n a_c(n)) = O\left(n \frac{c^2}{f^2} r^2(n)\right)$, and the total
number of flows is $x + O\left(n \frac{c^2}{f^2} r^2(n)\right) = x + O(\log^4 n)$ w.h.p. □

² This is a loose upper bound. The actual number of detour-routed flows traversing a cell is much smaller.

Lemma 10. The number of flow-links traversing any cell in transition phase (counting
repeat traversals separately) is $O(\log^4 n)$ w.h.p.

Proof.  First  let  us  account  for  the  SD'  stretch  of  each  flow’s  route,  without  considering  the 
possible  additional  last  hop.  We  account  for  it  explicitly  later  in  this  proof. 

By our construction, a non-detour-routed flow enters the transition phase only when
it is $\lceil y \rceil$ intermediate hops away from its destination. All such flows must have their
pseudo-destinations within a circle of radius $\Theta\left(\frac{c}{f} r(n)\right)$ centered in the
cell. The number of destinations that lie within a circle of radius $\Theta\left(\frac{c}{f}\right) r(n)$
from the cell is $O\left(n \frac{c^2}{f^2} r^2(n)\right) = O\left(\frac{c^3}{f^3} \log n\right)$ w.h.p.
(by suitable choice of $a(n)$ in Lemma 60). Thus the number of non-detour-routed flows that
may traverse a cell is $O\left(\frac{c^3}{f^3} \log n\right)$.

A detour-routed flow is always in transition phase. By Lemma 9, there are $O(\log^4 n)$
such flows traversing any cell. Each such flow can only traverse a cell at most twice along
the $SPD'$ stretch. This yields $O(\log^4 n)$ detour-routed flows (including repeat traversals).

Also, the cell may be traversed/re-traversed by some flows on their additional last hop.
There are $\Theta(n a(n))$ pseudo-destinations in the adjacent cells w.h.p., and thus
$O(n a(n)) = O\left(\frac{c \log n}{f}\right) = O(\log^2 n)$ such last hop flow traversals. Thus the
number of flows transitioning in any cell is
$O\left(\frac{c^3}{f^3} \log n\right) + O(\log^4 n) + O(\log^2 n)$. Taking note that $c = O(\log n)$,
it follows that the number of flows traversing the cell while in their transition phase is
$O(\log^4 n)$ w.h.p. □

Relay Node Selection. We now describe how a relay node is assigned to a flow’s route
in each cell.

A flow-link is said to enter a cell $\mathcal{H}$ on a channel $j$ if the flow’s route includes a hop
(link) $(v_{i-1}, v_i)$, where $v_{i-1}$ is in a cell adjacent to $\mathcal{H}$, $v_i$ is in $\mathcal{H}$,
and $v_{i-1}$ transmits the flow’s packets to $v_i$ using channel $j$ (this naturally implies that
both $v_{i-1}$ and $v_i$ can operate on channel $j$). Similarly, a flow-link is said to leave a cell
$\mathcal{H}$ on channel $j$ if the route includes a link $(v_i, v_{i+1})$, where $v_i$ is in
$\mathcal{H}$, $v_{i+1}$ is in a cell adjacent to $\mathcal{H}$, and $v_i$ transmits the flow’s
packets to $v_{i+1}$ using channel $j$.

When a flow-link must enter a cell in progress-on-source-channel phase on a certain
channel, then, amongst all nodes in that cell capable of switching on that channel, it is
assigned to the node which has the least number of flow-links entering on that channel
assigned to it so far. In the transition phase of a flow, a flow-link may need to be assigned
a relay node that can operate on a specific pair of channels (to facilitate transition). It can
be assigned to any node in the cell that satisfies the requirement. Similarly, once a flow in
transition phase has already completed the transition, the remaining links on the route
will enter the remaining cells on its route on the destination channel. Such a flow-link can
be assigned to any node in the cell that can switch on the destination channel.
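The relay selection rule for one cell can be sketched as a small bookkeeping class. The class and method names below are hypothetical, and nodes are represented simply as a mapping from a node identifier to the set of channels it can switch on.

```python
from collections import defaultdict


class CellRelayAssigner:
    """Tracks, for one cell, how many entering flow-links each node carries per channel."""

    def __init__(self, node_channels):
        # node_channels: dict mapping node id -> set of channels the node can switch on
        self.node_channels = node_channels
        self.load = defaultdict(int)  # (node, channel) -> entering flow-links assigned so far

    def assign_source_phase(self, channel):
        """Least-loaded rule for flow-links in progress-on-source-channel phase."""
        candidates = [n for n, chs in self.node_channels.items() if channel in chs]
        if not candidates:
            raise LookupError("no node in the cell can switch on this channel")
        best = min(candidates, key=lambda n: self.load[(n, channel)])
        self.load[(best, channel)] += 1
        return best

    def assign_transition(self, in_channel, out_channel):
        """Any node offering the required channel pair is acceptable in transition phase."""
        for n, chs in self.node_channels.items():
            if in_channel in chs and out_channel in chs:
                return n
        raise LookupError("no node in the cell offers the required channel pair")
```

Because the least-loaded rule always picks a node with the minimum current count, the per-channel counts of any two capable nodes never differ by more than one, which is what the per-node load bound later relies on.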


3.5.2  Load  Balance  within  a  Cell 

Recall that each cell has $\Theta(n a(n))$ nodes w.h.p., and $O(n\sqrt{a(n)})$ flows traversing it w.h.p.


Per-Channel  Load 


Lemma 11. The number of flow-links that enter any cell on any single channel is
$O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.


Proof. Consider a cell $\mathcal{H}$.

A flow’s route may enter the cell on a channel $i$ in any of the following circumstances:

1. The flow’s source channel is $i$ and it is in the progress-on-source-channel phase.

2. The flow’s route is in the transition phase, and transitioning through $i$.

3. The flow’s route is in the transition phase, its destination channel is $i$, and it has
already made a transition.

We first account for the flow-routes in progress-on-source-channel phase:

From our construction, and by our choice of $a(n)$, each flow stays in progress-on-source-channel
phase till there are $\lceil y \rceil$ intermediate hops left to the destination. Thus, a flow is
on its source channel in a given cell if its destination is more than $\lceil y \rceil$ intermediate
hops away.

Denote the number of flow-routes traversing the cell in progress-on-source-channel phase
by $m$. Then $m = O(n\sqrt{a(n)})$ (from Lemma 7).

Let $X_{ij}$ be an indicator variable which is 1 if flow-route $j$ enters the cell on channel $i$,
and is 0 otherwise.

From the model definition, each source’s interface is assigned a block of $f$ contiguous
channels in an i.i.d. manner, and it chooses one channel uniformly from the $x \ge \frac{f}{2}$
preferred channels in this channel block. Furthermore, the sequence of cells traversed by a
flow’s route is chosen in a manner independent of the channels the source can switch on.

Hence, the probability that a flow-route in progress-on-source-channel phase is on a
particular preferred channel $i$ is at most
$\frac{2}{f} \cdot \frac{\min\{f,\, c - f + 1\}}{c - f + 1} = \frac{2}{\max\{f,\, c - f + 1\}} \le \frac{4}{c}$,
yielding:

$\Pr[X_{ij} = 1] \le \frac{4}{c}$

$X_i = \sum_{j=1}^{m} X_{ij}$ denotes the number of flow-routes in progress-on-source-channel phase
that enter the cell on channel $i$. Evidently:

$E[X_i] \le \frac{4m}{c}$


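The $X_{ij}$'s are i.i.d. random variables for a given $i$, as each flow’s source channel is
chosen in an i.i.d. manner (though they may not be independent for different $i$, since
$X_{ij} = 1 \Rightarrow X_{kj} = 0 \ \forall k \ne i$). Hence we may set
$(1 + \beta)E[X_i] = \max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}$ (note that $\beta \ge e^2 - 1 > 0$),
and apply the Chernoff bound from Lemma 51 to obtain:

$$
\begin{aligned}
\Pr\left[X_i \ge \max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}\right]
  &\le \left(\frac{e^{\beta}}{(1 + \beta)^{(1 + \beta)}}\right)^{E[X_i]}
   \le \left(\frac{e}{1 + \beta}\right)^{(1 + \beta)E[X_i]} \\
  &\le \exp\left(-(1 + \beta)E[X_i]\right)
   = \exp\left(-\max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}\right)
\end{aligned}
\qquad (3.9)
$$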


The number of preferred channels $c_{pref}$ cannot exceed $c$. Applying the union bound over
the $c_{pref} \le c$ preferred channels, the probability that there are
$\max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}$ or more flow-links entering on any single channel
is at most $c \exp\left(-\max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}\right)$. Taking
another union bound over all $\frac{1}{a(n)}$ cells, the probability this happens in any cell
of the network is less than
$\frac{c}{a(n)} \exp\left(-\max\left\{\frac{4e^2 m}{c}, 3 \log n\right\}\right) = O\left(\frac{1}{n^2}\right)$.

Observing that $\max\left\{\frac{4e^2 m}{c}, 3 \log n\right\} = O\left(\frac{n\sqrt{a(n)}}{c}\right)$
proves that the number of non-transitioning flows that enter any cell on a given channel is
$O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.
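A quick numerical sanity check of this tail bound is sketched below. It is illustrative only: the function names are hypothetical, natural logarithms are assumed, and the per-channel load is modelled as a binomial variable with success probability exactly $\frac{4}{c}$, i.e. the largest mean the argument allows.

```python
import math


def load_threshold(m, c, n):
    """Threshold max{4 e^2 m / c, 3 log n} used in the Chernoff argument."""
    return max(4 * math.e ** 2 * m / c, 3 * math.log(n))


def chernoff_tail_bound(m, c, n):
    """Upper bound exp(-threshold) on Pr[X_i >= threshold]."""
    return math.exp(-load_threshold(m, c, n))


def binomial_tail(m, p, k):
    """Exact Pr[Bin(m, p) >= k], for comparison with the bound."""
    start = math.ceil(k)
    return sum(math.comb(m, j) * p ** j * (1 - p) ** (m - j)
               for j in range(start, m + 1))


def union_bound(m, c, n, num_cells):
    """Union bound over at most c channels and num_cells cells (capped at 1)."""
    return min(1.0, c * num_cells * chernoff_tail_bound(m, c, n))
```

With, say, `m = 400`, `c = 40` and `n = 1000`, the exact binomial tail at the threshold should come out no larger than the Chernoff bound, as the inequality chain requires.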

We now need to account for the fact that some of the flow-routes may be in the transition
phase, and may either be transitioning through an intermediate channel or may have
transitioned to the destination channel. From Lemma 10, the number of flow-links for such
flows which traverse the cell (counting repeat traversals separately) is $O(\log^4 n)$ w.h.p. Even
if they were all to enter on the same channel, the additional contribution to the load would
be $O(\log^4 n)$.

Hence the per-channel load in all cells is at most
$O\left(\frac{n\sqrt{a(n)}}{c}\right) + O(\log^4 n) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$
w.h.p. □

Lemma 12. The number of flow-links that leave any given cell on any single channel is
$O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.

Proof. The flows whose routes leave a cell fall into two categories: (1) those that originate
at some node in the cell, and (2) those that entered the cell but did not terminate there
(i.e., were relayed through the cell). The former can be no more than the number of nodes
in the cell, i.e. $\Theta(n a(n)) = \Theta\left(\frac{c \log n}{f}\right) = O(\log^2 n)$. For the latter,
note that any flow-link that leaves the cell must then enter one of the 8 adjacent cells. Thus,
the latter can be no more than 8 times the maximum number of flow-links entering a cell on any
one channel, which has been established as $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ in Lemma 11.
Hence, the total number of flow-links leaving any given cell on a given channel is
$O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p. □

Per-Node  Load 

Lemma 13. The number of flow-links that are assigned to any single node in any cell is
$O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.

Proof. A node is always assigned an outgoing link for the single flow for which it is the
source. A node is also assigned an incoming link for each flow for which it is the destination
(any such flows terminate in that cell), and there are $O(\log n)$ such flows for any node w.h.p.
(from Lemma 1).

Additionally,  a  node  may  act  as  a  relay  node  on  the  routes  of  other  flows.  For  each  such 
flow,  it  is  assigned  an  incoming  and  an  outgoing  link  (as  it  must  receive  the  flow’s  packets, 
and  then  transmit  them  on  to  a  next  hop  node). 

It may be assigned as relay for some flow-routes that are in the transition phase, and for
which it serves as one of the nodes in the channel-transition sequence, or it may be assigned
as relay for some flow-routes in transition phase which have completed the transition, if it
can operate on their destination channel. From Lemma 10 there are $O(\log^4 n)$ such
flow-links traversing a cell w.h.p. (counting possible repeat traversal by some detour-routed
flows, as well as any additional last hop traversals separately). As a result, the number of
such flow-links assigned to a node is $O(\log^4 n)$.

It may also be assigned as relay for flow-routes that are in progress-on-source-channel
phase while they traverse the cell. We have already established in Lemma 11 that the
number of flows that enter on a given channel in any cell is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$
w.h.p. From our routing and channel transition strategy, flow-links in the
progress-on-source-channel phase of a route are always operated on the source’s selected
preferred channel. From Lemma 4, there are at least $12 \log n$ nodes on each preferred channel
in each cell w.h.p. As per our previously described relay node selection strategy, when a relay
node is to be assigned to an incoming flow-link in progress-on-source-channel phase in a cell on
a certain channel, then amongst all nodes in the cell capable of switching on that channel, it is
assigned to the node which has the least number of entering flow-links assigned on that channel
so far. By using such an assignment strategy, it follows that no node can have more than
$O\left(\frac{n\sqrt{a(n)}}{c \log n}\right)$ such flow-links assigned on any single channel, and no
more than $O\left(\frac{f\, n\sqrt{a(n)}}{c \log n}\right) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$
such flow-links assigned overall (recall that $c = O(\log n)$, and $f \le c$).

For each incoming flow-link assigned to a node for relaying, there is a corresponding
outgoing flow-link (as the node is a relay).

Thus, the resultant number of assigned flows per node is
$1 + O(\log n) + O(\log^4 n) + O\left(\frac{n\sqrt{a(n)}}{c}\right) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$
w.h.p. □

3.5.3  Transmission  Schedule 

We noted earlier that each cell can face interference from at most a constant number $\gamma$ of
nearby cells. The resultant cell-interference graph has a chromatic number at most $1 + \gamma$.
Therefore, it is possible to obtain a global interference-free TDMA schedule having
$1 + \gamma$ time slots in each round. In any slot, if a cell is active, then all cells
that interfere with it are inactive. The next issue is that of intra-cell scheduling. We need
to schedule transmissions so as to ensure that, at any time instant, there is at most one
transmission on any given channel in the cell. Besides, we also need to ensure that no node
is expected to transmit or receive more than one packet at any time instant. We use the
following procedure to obtain an intra-cell schedule:

We construct a conflict graph based on the nodes in the active cell, and its adjacent
cells, as follows:

We create a separate vertex for each flow-link leaving the cell (note that the hop-sender
of each such flow-link shall lie in the active cell, and the hop-receiver shall lie in one of
the adjacent cells). Since the flow-link operates on an assigned channel, each vertex in the
graph has an implicit associated channel. Besides, each vertex has an associated pair of
nodes corresponding to the hop endpoints. Two vertices are connected by an edge if either
(1) they have the same associated channel, or (2) at least one of their associated nodes is
the same.

The scheduling problem reduces to obtaining a vertex-coloring of this graph. If we have a
vertex coloring, then it ensures that (1) a node is never simultaneously sending/receiving for
more than one flow-link, and (2) no two flow-links on the same channel are active simultaneously.
The number of neighbors of a graph vertex is upper bounded by the number of flow-links
leaving the active cell on that channel, and the number of flow-links assigned to the flow’s
two hop endpoints (both hop-sender and hop-receiver). From Lemma 12 and Lemma 13,
the degree of the conflict graph is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$. Since any
graph with maximum degree $d$ has chromatic number at most $d + 1$, the conflict graph can
be colored in $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ colors.
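The conflict graph and a coloring of it can be sketched as follows. The text only needs the existence of a $(d+1)$-coloring; the greedy procedure below is one standard way to obtain such a coloring and is used here as an illustrative stand-in. Function names and the `(sender, receiver, channel)` tuple format are assumptions of this sketch.

```python
def build_conflict_graph(flow_links):
    """flow_links: list of (sender, receiver, channel) tuples leaving the active cell.

    Two flow-links conflict if they use the same channel or share an endpoint.
    Returns an adjacency dict over flow-link indices.
    """
    adj = {v: set() for v in range(len(flow_links))}
    for a in range(len(flow_links)):
        sa, ra, ca = flow_links[a]
        for b in range(a + 1, len(flow_links)):
            sb, rb, cb = flow_links[b]
            if ca == cb or {sa, ra} & {sb, rb}:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def greedy_coloring(adj):
    """Assign each vertex the smallest color unused by its already-colored neighbors.

    Uses at most (max degree + 1) colors; each color corresponds to one sub-slot.
    """
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color
```

Each color class is a set of flow-links that can share a sub-slot: no two of them use the same channel, and no node appears in more than one of them.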

Therefore, the cell-slot can be divided into
$O\left(\frac{n\sqrt{a(n)}}{c}\right) = O\left(\frac{1}{c}\sqrt{\frac{c n \log n}{f}}\right)$ equal
length sub-slots, and all flow-links get a sub-slot for transmission.

This yields that each flow will get $\Omega\left(W\sqrt{\frac{f}{c n \log n}}\right)$ throughput.

Combining this with the upper bound from Section 3.4, we obtain the following theorem:

Theorem 2. With an adjacent $(c, f)$-channel assignment, where $c = O(\log n)$, the network
capacity is $\Theta\left(W\sqrt{\frac{f}{c n \log n}}\right)$ per flow.

3.6  The  Case  of  Untuned  Radios 

It was proposed in [93] that extremely inexpensive wireless devices can be manufactured
if it is possible to handle untuned radios whose operating frequency may lie randomly
within some band. A random network coding based approach was described in [93] to relay
information between a single source-destination pair using such devices as relays. In this
section, we obtain capacity results for a randomly deployed network of $n$ nodes with one
untuned radio each, with our assumed model, i.e., $n$ random source-destination pairs, and
store-and-forward routing.

Figure 3.2: The Untuned Radio Model (center frequency uniformly distributed over $(F_1, F_2)$; spectral bandwidth $B$)

The untuned channel model is as follows: each node possesses a transceiver with center
frequency uniformly distributed in the range $(F_1, F_2)$, and admits a spectral bandwidth $B$
(Fig. 3.2). Let $c = \left\lfloor \frac{F_2 - F_1}{B} \right\rfloor$. Then $c$ is the maximum number
of disjoint channels that could be possibly obtained if each channel occupied a frequency band
of width $B$. For simplicity, the rest of the discussion assumes that
$c = \left\lfloor \frac{F_2 - F_1}{B} \right\rfloor = \frac{F_2 - F_1}{B}$ (i.e., the width of the
interval $(F_1, F_2)$ is chosen to be a multiple of $B$).

However, the channels of operation of these radios are untuned and hence partially
overlapping, rather than disjoint. As per the assumption in [93], two nodes can communicate
directly if the center frequency of one is admitted by the other, i.e., if there is at least 50%
overlap between two channels, communication is possible. We consider the issue of capacity
of a network of $n$ nodes, deployed uniformly at random, where each node has an untuned
radio, and each node is the source of one flow, with a randomly chosen destination.

Even though each node only possesses a single radio and stays on a single sub-band,
due to the partial overlap between sub-bands, it is still possible to ensure that any pair of
nodes will be connected via some path. Contrast this to the case of orthogonal channels,
where we argued in Section 2.5 that when $f = 1$ and $c > 1$, some pairs of nodes are
disconnected from each other because they do not share a channel. It is possible to map
the partial overlap feature of the untuned channel case to adjacent $(2c + 2, 3)$ and $(4c + 1, 2)$
assignment, and obtain upper and lower bounds. Note that $f \ge 2$ allows for all nodes to
be connected, even with orthogonal channels.

Figure 3.3: Untuned Radios: Upper Bound via virtual (2c + 2, 3) channelization

3.6.1  Upper  Bound  on  Capacity 

We map the untuned radio scenario to a scenario involving $(2c + 2, 3)$ adjacent channel
assignment (Fig. 3.3).

We perform a virtual channelization of the band $(F_1, F_2)$ into $2c$ orthogonal sub-bands.
We add an additional (virtual) sub-band of the same width at each end of the band, to get
$2c + 2$ orthogonal channels, numbered $1, \ldots, 2c + 2$. Thus 1 and $2c + 2$ are the artificially
added channels. If a radio’s center frequency lies within virtual channel $i$, it is associated
with virtual channel block $(i - 1, i, i + 1)$, and $i - 1$ is called its primary virtual channel.
Thus the primary channel can only be one of $1, 2, \ldots, 2c$ (since the center frequency can only
fall in $2, \ldots, 2c + 1$). If a node’s primary channel is $i$, it is capable of communicating with
all nodes with primary virtual channel $i - 2 \le j \le i + 2$ in the virtual channelization. In
the actual situation, the node with the untuned radio would be able to communicate with
some subset of those nodes. Thus, if a pair of nodes cannot communicate directly in the
virtual channelization, they cannot do so in the actual situation either, and disconnection
events in the former are preserved in the latter. The probability that a node has virtual
channel block $(j, j + 1, j + 2)$ is $\frac{1}{2c}$, i.e., the same as for adjacent $(2c + 2, 3)$
assignment, and the assignment of each node is independent. Therefore, the necessary condition
for the (virtual) $(2c + 2, 3)$ assignment continues to hold for the corresponding untuned radio
case. This yields an upper bound on capacity of $O\left(W\sqrt{\frac{1}{c n \log n}}\right)$.
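The mapping from a center frequency to its virtual channel block can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, frequencies are plain floats, and a center frequency falling exactly on the upper band edge is folded into the last real sub-band.

```python
def virtual_block_upper(center, f1, f2, bandwidth):
    """Map a center frequency to its (2c+2, 3) virtual channel block (i-1, i, i+1).

    The band (f1, f2) is cut into 2c sub-bands of width B/2, numbered 2..2c+1;
    channels 1 and 2c+2 are the artificially added end channels.
    """
    c = round((f2 - f1) / bandwidth)
    width = (f2 - f1) / (2 * c)
    idx = min(int((center - f1) // width), 2 * c - 1)  # 0-based real sub-band index
    i = idx + 2
    return (i - 1, i, i + 1)


def can_communicate(fa, fb, bandwidth):
    """Actual untuned radios: one center frequency must be admitted by the other."""
    return abs(fa - fb) <= bandwidth / 2
```

Two radios whose center frequencies are within $B/2$ of each other land in the same or in neighboring sub-bands, so their virtual blocks always intersect; this is the property the upper-bound argument uses.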

3.6.2  Lower  Bound  on  Capacity 

It can be shown that a schedule constructed for an adjacent $(4c + 1, 2)$ assignment can be
used almost as-is with untuned radios (except that the number of subslots in the cell-slot
must increase by a constant factor to avoid interference due to overlap).

Figure 3.4: Untuned Radios: Lower Bound via virtual (4c + 1, 2) channelization

We perform a virtual channelization of the band $(F_1, F_2)$ into $4c + 1$ orthogonal
sub-bands. If a radio’s center frequency lies within virtual channel $i$, it is associated with
virtual channel block $(i, i + 1)$, and $i$ is called its primary virtual channel.

Thus, if a pair of nodes can operate on a common channel in the virtual channelization,
then they are always capable of direct communication in the actual untuned radio situation.
The probability that a radio has virtual channel block $(i, i + 1)$ is $\frac{1}{4c}$, which is the
same as for adjacent $(4c + 1, 2)$ assignment, and the assignment of each node is independent. In
the adjacent $(4c + 1, 2)$ assignment, all channels are orthogonal and can operate concurrently.
With untuned radios, we assume that two nodes can interfere if there is some spectral overlap.
Thus, a transmission by a node on center frequency $F$ can interfere with transmissions by
nodes with center frequency in the range $(F - B, F + B)$. Hence, the transmission schedule
for untuned radios is made to follow the additional constraint that if a node with primary
virtual channel $i$ is active then no node with primary channel $i - 5 \le j \le i + 5$ should be
active simultaneously. This can decrease capacity by a factor of 11, but would not affect
the order of the asymptotic results. Also, in the actual network involving untuned radios,
a transceiver can use up to $B = \frac{F_2 - F_1}{c}$ spectral bandwidth, while in the adjacent
$(4c + 1, 2)$ case, it would be $\frac{F_2 - F_1}{4c + 1}$, leading to the possibility of having a
higher data-rate in the former, given the same transmission power, modulation, etc. However,
this can only affect capacity by a small constant factor, which does not affect the order of the
results.

In the adjacent $(4c + 1, 2)$ case, our construction performs transitions to ensure that a
source on channels $(i, i + 1)$ and a destination on channels $(i + j, i + j + 1)$ can communicate
over $j \le 4c$ hops. In the untuned radio case, transitioning occurs through nodes that
provide the required virtual channel pair, and the same transition strategy as for $(4c + 1, 2)$
assignment continues to work. Hence the capacity is $\Omega\left(W\sqrt{\frac{1}{c n \log n}}\right)$ per flow.
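The $(4c + 1, 2)$ virtual channelization and the extra scheduling constraint can be sketched as follows. The function names are hypothetical, frequencies are plain floats, and the handling of a center frequency exactly on the upper band edge is an assumption of this sketch.

```python
def primary_virtual_channel(center, f1, f2, bandwidth):
    """Primary virtual channel (1..4c+1) under the (4c+1, 2) virtual channelization."""
    c = round((f2 - f1) / bandwidth)
    width = (f2 - f1) / (4 * c + 1)
    return min(int((center - f1) // width), 4 * c) + 1


def shares_virtual_channel(pa, pb):
    """Blocks (pa, pa+1) and (pb, pb+1) intersect exactly when the primaries differ by at most 1."""
    return abs(pa - pb) <= 1


def may_transmit_together(pa, pb):
    """Extra scheduling constraint: primaries within 5 of each other must not be active together."""
    return abs(pa - pb) > 5
```

Each sub-band is narrower than $B/4$, so radios that share a virtual channel have center frequencies less than $B/2$ apart and can really communicate, while any two radios with overlapping spectrum have primaries at most 5 apart and are therefore never scheduled together.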

We re-emphasize that even though $f = 1$, the untuned nature of the radios allows
for a progressive shift in the frequency over which the packet gets transmitted, thereby
allowing a step-by-step transition from the source’s center frequency to a frequency admitted
by the destination. The adjacent $(c, f)$ model captures this progressive frequency-shift
characteristic, and is thus able to model the untuned radio situation.

The upper and lower bounds proved in this section lead to the following:

Theorem 3. In the regime $c = O(\log n)$, the capacity of a randomly deployed network of
untuned radios is $\Theta\left(W\sqrt{\frac{1}{c n \log n}}\right)$ per flow.

3.7  Discussion 

The capacity-achieving construction provides some useful insights. As is intuitive, when all
nodes cannot switch on all channels, the transmission range needs to be larger to preserve
network connectivity. This leads to a loss of capacity compared to the case of unconstrained
switching. Also, it may no longer be possible to use the straight-line path towards the
destination, and a flow may need to traverse a larger number of hops (detour routing) in
order to ensure that the destination is reached. However, when the number of channels
is much smaller than the number of nodes, the increase in the length of the routes is not
asymptotically significant, and only affects the capacity by a constant factor. Taking all
factors into account, given situations where each radio-interface can only be manufactured
to switch on f channels out of a total of c available channels (where c = O(log n)), it is
beneficial in the asymptotic regime to attempt to use all channels by assigning different
channel subsets to different nodes, rather than follow the naive approach of using the same
f channels at all nodes. In the latter case, the per-flow capacity would be reduced to
$\Theta\left(\frac{Wf}{c\sqrt{n \log n}}\right)$. Thus, the use-all-channels approach outperforms the f-common-channels
approach by a factor of $\Theta\left(\sqrt{c/f}\right)$. For instance, even when f = 2, utilizing all channels yields
a capacity of the order of $\sqrt{c}$ channels.
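The comparison between the two approaches can be illustrated numerically. The sketch below is only indicative: it assumes the order expressions $W\sqrt{f/(cn\log n)}$ (all channels used) and $Wf/(c\sqrt{n\log n})$ (same f channels everywhere), drops all hidden constants, and uses hypothetical function names.

```python
import math

def all_channels_rate(W, n, c, f):
    """Order of per-flow capacity when all c channels are used (constants dropped)."""
    return W * math.sqrt(f / (c * n * math.log(n)))

def common_channels_rate(W, n, c, f):
    """Order of per-flow capacity when every node uses the same f channels."""
    return (W * f / c) / math.sqrt(n * math.log(n))

def gain(c, f):
    """Ratio of the two expressions above, i.e. sqrt(c / f)."""
    return math.sqrt(c / f)
```

For example, with c = 16 and f = 4 the ratio of the two expressions is 2, independent of W and n.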
Chapter 4

Random (c, f) Assignment


In this chapter, we present connectivity and capacity results for the random (c, f) assignment
model that was introduced in Chapter 2. We begin by defining the random (c, f)
assignment model in Section 4.1, and thereafter summarize the chapter results in Section
4.2. In Section 4.3, we state and prove some preliminary results used by subsequent proofs.
We present necessary and sufficient conditions for connectivity in Section 4.4. Section 4.5
presents an upper bound on capacity. In Section 4.6 we describe a sub-optimal lower bound
construction for capacity. The optimal lower bound construction is described in Section 4.7.
Finally, in Section 4.8, we discuss the implications of the capacity result, and the insights
that can be obtained from it.


4.1  Model  Definition 


In the random (c, f) assignment model, each radio-interface is assigned a subset of f channels
from a total of c available channels (2 ≤ f ≤ c), uniformly at random from all such
possible subsets. This leads to the following:¹

$$\Pr[\text{a given interface can switch on channel } i] = p_s^{rnd}(i) = \frac{f}{c} = p_s^{rnd}, \quad \forall i \tag{4.1}$$

$$\Pr[\text{two given interfaces can switch on at least one common channel}] = p_{rnd} = 1 - \frac{\binom{c-f}{f}}{\binom{c}{f}} = 1 - \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right) \tag{4.2}$$

Evidently: $f \ge c - f + 1 \implies p_{rnd} = 1$.

¹ The number of ways of selecting k objects from a set of m objects, i.e., $\binom{m}{k}$, is usually defined as
$\frac{m!}{k!(m-k)!}$ for m ≥ k ≥ 0. For k > m ≥ 0 or k ≥ 0 > m, one can uniformly define $\binom{m}{k}$ to be 0, as there exists no
way of selecting k objects from a set of m objects under these circumstances. In this chapter, we use this
convention for notational convenience. It is also to be noted that the expression $\prod_{i=1}^{k} \frac{m-i+1}{i}$ yields $\binom{m}{k}$
for 0 ≤ k ≤ m and is 0 for k > m ≥ 0.

Since  we  consider  only  single-interface  nodes  for  the  results  in  this  chapter,  there  is  a 
one-to-one  mapping  between  interfaces  and  nodes.  Thus,  as  also  in  Chapter  3,  we  often  use 
the  term  node  instead  of  interface  in  the  following  discussion. 
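As a quick numerical companion to (4.2), the following sketch evaluates $p_{rnd}$ both through the binomial ratio and through the product form, using exact rational arithmetic. The function names are illustrative only and are not part of the original text.

```python
from fractions import Fraction
from math import comb

def p_rnd_binomial(c: int, f: int) -> Fraction:
    """1 - C(c-f, f) / C(c, f); math.comb returns 0 when f > c - f."""
    return 1 - Fraction(comb(c - f, f), comb(c, f))

def p_rnd_product(c: int, f: int) -> Fraction:
    """1 - prod_{i=1}^{f} (1 - f / (c - i + 1)), the product form of (4.2)."""
    miss = Fraction(1)
    for i in range(1, f + 1):
        miss *= 1 - Fraction(f, c - i + 1)
    return 1 - miss
```

For instance, with c = 10 and f = 2 both forms give 17/45, and with c = 5 and f = 3 (so that f ≥ c − f + 1) both give 1.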


4.2  Summary  of  Results 

We  prove  the  following  results: 

1. We show that in the regime c = O(log n), the critical range for connectivity with
random (c, f) assignment is $\Theta\left(\sqrt{\frac{\log n}{p_{rnd}\, n}}\right)$.

2. We establish the per-flow capacity with random (c, f) assignment for the regime c =
O(log n) as $\Theta\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$.

It can be shown that $p_{rnd} \ge 1 - e^{-f^2/c}$. Hence, the implication of this capacity result is
that, when $f = \Omega(\sqrt{c})$, random (c, f) assignment yields capacity of the same order as
attainable via unconstrained switching. Thus, for the random (c, f) assignment model, $\sqrt{c}$-
switchability is sufficient to make order-optimal use of all c channels, when c = O(log n).

A  preliminary  version  of  the  chapter  results  was  reported  in  [7,  6]. 

4.3  Preliminaries 

In  this  section,  we  state  and  prove  some  results  that  are  required  for  the  proofs  that  follow. 
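Lemma 14. For c ≥ 2, and 2 ≤ f ≤ c:

$$\frac{c\, p_{rnd}}{f} \le \min\left\{\frac{c}{f}, 2f\right\} \tag{4.3}$$

Proof. Since $p_{rnd} \le 1$, it follows that:

$$\frac{c\, p_{rnd}}{f} \le \frac{c}{f} \tag{4.4}$$

If $f > \sqrt{c/2}$:

$$\frac{c\, p_{rnd}}{f} \le \frac{c}{f} < \sqrt{2c} < 2f \qquad \because\ p_{rnd} \le 1 \tag{4.5}$$

Now consider the case $f \le \sqrt{c/2}$. It is to be noted that $\sqrt{c/2} \le \frac{c}{2}$
for all c ≥ 2. We take note of the following inequality:

$$\frac{2f^2}{c} \le 1 \implies \ln\left(\left(1 - \frac{2f}{c}\right)^{f}\right) = f \ln\left(1 - \frac{2f}{c}\right) \ge \ln\left(1 - \frac{2f^2}{c}\right) \tag{4.6}$$

so that $\left(1 - \frac{2f}{c}\right)^{f} \ge 1 - \frac{2f^2}{c}$ (since ln x is an increasing function of x).

Noting that $c - f + 1 \ge \frac{c}{2}$ and $c - f + 1 \ge f$ whenever $f \le \sqrt{c/2}$, we obtain that:

$$1 - p_{rnd} = \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right) \ge \left(1 - \frac{f}{c-f+1}\right)^{f} \ge \left(1 - \frac{2f}{c}\right)^{f} \ge 1 - \frac{2f^2}{c} \quad (\text{using (4.6)})$$

$$\therefore\ p_{rnd} \le \frac{2f^2}{c} \implies \frac{c\, p_{rnd}}{f} \le 2f \tag{4.7}$$

From (4.4), (4.5) and (4.7):

$$\frac{c\, p_{rnd}}{f} \le \min\left\{\frac{c}{f}, 2f\right\}$$

□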


Lemma 15. $\min\left\{\frac{c}{f}, 2f\right\} \le \sqrt{2c}$

Proof. For a given c, we have 2 ≤ f ≤ c. Thus, given c, $\frac{c}{f}$ is a monotonically decreasing
function of f, while 2f is a monotonically increasing function of f. $\frac{c}{f} = 2f = \sqrt{2c}$ at
$f = \sqrt{c/2}$. For $f \le \sqrt{c/2}$, $\min\{\frac{c}{f}, 2f\} = 2f \le \sqrt{2c}$, and for $f > \sqrt{c/2}$, $\min\{\frac{c}{f}, 2f\} = \frac{c}{f} < \sqrt{2c}$.

Thus $\min\{\frac{c}{f}, 2f\} \le \sqrt{2c}$. □
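Lemmas 14 and 15 can be spot-checked by brute force over small parameter values. The sketch below is only a sanity check under exact rational arithmetic, not a substitute for the proofs; the helper names are hypothetical, and Lemma 15 is tested in squared form to avoid floating-point square roots.

```python
from fractions import Fraction
from math import comb

def p_rnd(c, f):
    """Probability that two random f-subsets of c channels intersect."""
    return 1 - Fraction(comb(c - f, f), comb(c, f))

def lemmas_14_15_hold(c_max):
    """Check c*p_rnd/f <= min{c/f, 2f} <= sqrt(2c) for all 2 <= f <= c <= c_max."""
    for c in range(2, c_max + 1):
        for f in range(2, c + 1):
            lhs = c * p_rnd(c, f) / f
            bound = min(Fraction(c, f), Fraction(2 * f))
            # Lemma 15 in squared form: bound^2 <= 2c
            if not (lhs <= bound and bound * bound <= 2 * c):
                return False
    return True
```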

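Lemma 16. The following inequality holds for all 2 ≤ f ≤ c:

$$\frac{2\binom{c-f}{f}}{\binom{c}{f}} - \frac{\binom{c-2f}{f}}{\binom{c}{f}} \le 1 - \frac{p_{rnd}^2}{40}$$

Proof. We begin by observing that $\frac{\binom{c-f}{f}}{\binom{c}{f}} = 1 - p_{rnd}$.

Consider the following cases:

Case 1: $f \ge c - f + 1$

This implies that $p_{rnd} = 1$. Noting that the L.H.S. cannot exceed $2(1 - p_{rnd}) = 0$, the
result follows trivially.

Case 2: $\frac{c+1}{3} \le f < c - f + 1$

This implies that $\binom{c-2f}{f} = 0$. Moreover:

$$\frac{\binom{c-f}{f}}{\binom{c}{f}} = \prod_{i=1}^{f}\left(1 - \frac{f}{c-i+1}\right) \le \left(1 - \frac{f}{c-f+2}\right)\left(1 - \frac{f}{c-f+1}\right) \le \left(1 - \frac{f}{2f+1}\right)\left(1 - \frac{f}{2f}\right) \le \left(\frac{3}{5}\right)\left(\frac{1}{2}\right) = \frac{3}{10} \quad (\text{recall that } f \ge 2)$$

Therefore, the L.H.S. is upper bounded by $2\left(\frac{3}{10}\right) - 0 = \frac{3}{5} < 1 - \frac{1}{40} \le 1 - \frac{p_{rnd}^2}{40}$ (since
$p_{rnd} \le 1$).

Note that when $2f < c - f + 1$:

$$\frac{2\binom{c-f}{f}}{\binom{c}{f}} - \frac{\binom{c-2f}{f}}{\binom{c}{f}} = 2\prod_{i=1}^{f}\left(1 - \frac{f}{c-i+1}\right) - \prod_{i=1}^{f}\left(1 - \frac{2f}{c-i+1}\right)$$

The next two cases pertain to this regime.

Case 3: $f < \frac{c+1}{3}$ and $\sum_{i=1}^{f} \frac{f}{c-i+1} > 0.8$

Set $x_i = \frac{f}{c-i+1}$. Note that $1 - p_{rnd} = \prod_{i=1}^{f}(1 - x_i) \le \prod_{i=1}^{f} e^{-x_i} = e^{-\sum_i x_i} < 0.45$.
Therefore $p_{rnd} > 0.55$. Hence:

$$2\prod_{i=1}^{f}(1 - x_i) - \prod_{i=1}^{f}(1 - 2x_i) \le 2\prod_{i=1}^{f}(1 - x_i) < 0.9 \le 1 - \frac{p_{rnd}^2}{10} \le 1 - \frac{p_{rnd}^2}{40}$$

Case 4: $f < \frac{c+1}{3}$ and $\sum_{i=1}^{f} \frac{f}{c-i+1} \le 0.8$

Denote by $\mathbb{1}_{2m+1 \le f}$ the indicator variable which is 1 if 2m + 1 ≤ f and 0 else. We first
prove the following inequality:

$$
\begin{aligned}
2\prod_{i=1}^{f}(1 - x_i) - \prod_{i=1}^{f}(1 - 2x_i)
&= 2\left(1 - \sum_i x_i + \frac{\sum_i x_i \sum_{j \ne i} x_j}{2!} - \frac{\sum_i x_i \sum_{j \ne i} x_j \sum_{k \ne i,j} x_k}{3!} + \cdots + (-1)^f x_1 x_2 \cdots x_f\right) \\
&\quad - \left(1 - \sum_i (2x_i) + \frac{\sum_i (2x_i) \sum_{j \ne i} (2x_j)}{2!} - \cdots + (-1)^f (2x_1)(2x_2)\cdots(2x_f)\right) \\
&= 1 + (2 - 2^2)\frac{\sum_i x_i \sum_{j \ne i} x_j}{2!} - (2 - 2^3)\frac{\sum_i x_i \sum_{j \ne i} x_j \sum_{k \ne i,j} x_k}{3!} + \cdots + (-1)^f (2 - 2^f)\, x_1 x_2 \cdots x_f \\
&= 1 - \sum_i x_i \sum_{j \ne i} x_j \left(1 - \sum_{k \ne i,j} x_k\right) \\
&\quad - \sum_{m=2}^{\lfloor f/2 \rfloor} \frac{2^{2m} - 2}{(2m)!} \sum_{i_1} x_{i_1} \sum_{i_2 \ne i_1} x_{i_2} \cdots \sum_{i_{2m} \ne i_1, i_2, \ldots} x_{i_{2m}} \left(1 - \frac{2^{2m+1} - 2}{(2m+1)(2^{2m} - 2)} \sum_{i_{2m+1} \ne i_1, i_2, \ldots} x_{i_{2m+1}} \mathbb{1}_{2m+1 \le f}\right) \\
&\le 1 - 0.2 \sum_i x_i \sum_{j \ne i} x_j \quad \text{whenever } \sum_i x_i \le 0.8
\end{aligned}
$$

$$\because\ 1 - \frac{2^{2m+1} - 2}{(2m+1)(2^{2m} - 2)} \sum_{i_{2m+1} \ne i_1, i_2, \ldots} x_{i_{2m+1}} \mathbb{1}_{2m+1 \le f} \ge 0 \quad \forall m \ge 2 \text{ when } \sum_i x_i \le 0.8.$$

Set $x_i = \frac{f}{c-i+1}$. By assumption $\sum_i x_i = \sum_{i=1}^{f} \frac{f}{c-i+1} \le 0.8$. Also $\sum_i \left(x_i \sum_{j \ne i} x_j\right) \ge f(f-1)\frac{f^2}{c^2} \ge \frac{p_{rnd}^2}{8}$
(applying Lemma 14, and noting that f − 1 ≥ f/2). Hence:

$$2\prod_{i=1}^{f}(1 - x_i) - \prod_{i=1}^{f}(1 - 2x_i) \le 1 - 0.2 \sum_i x_i \sum_{j \ne i} x_j \le 1 - \frac{p_{rnd}^2}{40}$$

□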
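The inequality of Lemma 16 can also be spot-checked exhaustively for small c. The sketch below assumes the statement reads $\frac{2\binom{c-f}{f} - \binom{c-2f}{f}}{\binom{c}{f}} \le 1 - \frac{p_{rnd}^2}{40}$ and uses the footnote's convention that a binomial coefficient with k > m or m < 0 is zero; `binom` and `lemma16_sides` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb

def binom(m, k):
    """Binomial coefficient with the convention: 0 when k > m or m < 0."""
    return comb(m, k) if 0 <= k <= m else 0

def lemma16_sides(c, f):
    """Return (L.H.S., R.H.S.) of the Lemma 16 inequality as exact fractions."""
    total = binom(c, f)
    lhs = Fraction(2 * binom(c - f, f) - binom(c - 2 * f, f), total)
    p = 1 - Fraction(binom(c - f, f), total)
    return lhs, 1 - p * p / 40
```

Note that 1 minus the left-hand side is the probability that a random f-subset intersects each of two given disjoint f-subsets, which is how the lemma is used in the connectivity proof.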


4.4  Conditions  for  Connectivity 


We now show that the critical range for connectivity with random (c, f) assignment in the
regime c = O(log n) is $\Theta\left(\sqrt{\frac{\log n}{p_{rnd}\, n}}\right)$.


4.4.1  Necessary  Condition  for  Connectivity 

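Theorem 4. With a random (c, f) channel assignment (when c = O(log n)), if $\pi r^2(n) = \frac{\log n + b(n)}{p\, n}$,
where $p = p_{rnd} = 1 - \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right)$ and c = O(log n), and
$\limsup_{n \to \infty} b(n) = b < +\infty$, then:

$$\liminf_{n \to \infty} \Pr[\text{disconnection}] \ge e^{-b}\left(1 - e^{-b}\right)$$

where by disconnection we imply the event that there is a partition of the network.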

Proof. The proof is obtained by an adaptation of the proof technique used in [42]. We
provide a proof-sketch here. The detailed proof is described in Appendix B.

We focus on the disconnection events where some node(s) are isolated.

From the model definition, the probability that two nodes in range of each other can
operate on at least one common channel is $p = p_{rnd}$ where $1 - p_{rnd} = \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right)$.

The probability that a node x is isolated, i.e., cannot communicate with any other node,
is given by $p_1 = (1 - p\pi r^2(n))^{n-1}$. One can also obtain an upper bound $p_2$ on the probability
that two nodes x and y are both isolated. It can be shown that:

$$
\begin{aligned}
\Pr[\text{disconnection}] &\ge \sum_{x} \Pr[x \text{ is the only isolated node}] \\
&\ge \sum_{x} \Pr[x \text{ isolated}] - \sum_{x,\, y \ne x} \Pr[x \text{ and } y \text{ both isolated}] \\
&\ge n p_1 - n(n-1) p_2 \\
&\ge \theta e^{-b} - (1 + \epsilon) e^{-2b} \quad \text{where } b = \limsup_{n \to \infty} b(n)
\end{aligned}
\tag{4.9}
$$

for any θ < 1, ε > 0, and large n.

Therefore, if $\limsup_{n \to \infty} b(n) = b < +\infty$, the network is asymptotically disconnected with
some positive probability. □
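The leading term $n p_1$ in (4.9) can be evaluated numerically to see how it approaches $e^{-b}$. The sketch below is a rough illustration, assuming $\pi r^2(n) = \frac{\log n + b}{p\, n}$ with a constant b and $p_1 = (1 - p\pi r^2(n))^{n-1}$; the function name is hypothetical.

```python
import math

def expected_isolated(n, p, b):
    """n * p1, where p1 = (1 - p * pi r^2(n))^(n-1) and pi r^2(n) = (log n + b) / (p n)."""
    area = (math.log(n) + b) / (p * n)   # pi r^2(n)
    p1 = (1 - p * area) ** (n - 1)       # probability that a given node is isolated
    return n * p1
```

Since p cancels inside $p_1$, the value depends only on n and b; for large n it is close to $e^{-b}$, which is the source of the $e^{-b}$ term in the bound.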

Corollary 2. With a random (c, f) assignment, the necessary condition for connectivity is
that $r(n) = \Omega\left(\sqrt{\frac{\log n}{p_{rnd}\, n}}\right)$, else the network is disconnected with some positive probability.

4.4.2  Sufficient  Condition  for  Connectivity 

Theorem 5. With random (c, f) assignment, in the regime c = O(log n), if $\pi r^2(n) = \frac{800\pi \log n}{p_{rnd}\, n}$
then:

$$\lim_{n \to \infty} \Pr[\text{network is connected}] = 1$$

Proof  The  construction  is  based  on  per-node  structures  termed  as  backbones. 

Consider  a  subdivision  of  the  unit  torus  into  square  cells  of  area  a(n)  =  Noting 

that  prnd  >  ^  and  setting  a(n)  =  in  Lemma  59,  there  are  at  least  ^ 

nodes  in  each  cell  with  probability  at  least  1  —  ^ .  Choose  r{n)  =  y^8a{n).  Resultantly, 

a  node  in  any  given  cell  has  all  nodes  in  adjacent  cells  within  its  range. 

Within each cell, we categorize nodes as either transition facilitators or backbone candidates
(the meaning of these terms shall become clear later) in the following manner: We
choose $\left\lfloor \frac{2 \log n}{p_{rnd}} \right\rfloor$ nodes uniformly at random, and set them apart as transition facilitators.
This leaves at least $\frac{48 \log n}{p_{rnd}}$ nodes in each cell that can be deemed as backbone candidates.

Consider any node in any given cell. The probability that it can communicate with any
other random node in its range is $p_{rnd}$. Hence, the probability that in an adjacent cell, there
is no backbone candidate node with which it can communicate is at most $(1 - p_{rnd})^{\left\lceil \frac{48 \log n}{p_{rnd}} \right\rceil} \le
\frac{1}{e^{48 \log n}} = \frac{1}{n^{48}}$ (applying Fact 2).

The probability that a given node cannot communicate with any node in some adjacent
cell is thus at most $\frac{8}{n^{48}}$ (applying the union bound over all 8 possible adjacent cells). By
applying the union bound over all n nodes, the probability that at least one node is unable
to communicate with any backbone candidate node in at least one of its adjacent cells is at
most $\frac{8}{n^{47}}$.

We associate with each node x a set of nodes and links B(x) called the backbone for x.
B(x) is constructed as follows:

Throughout the procedure, cells that are already covered by the under-construction
backbone are referred to as filled cells. x is by default a member of B(x), and its cell is the
first filled cell. From each adjacent cell, amongst all backbone candidate nodes sharing at
least one common channel with x, one node (and hence also the link between that node and
x) is chosen uniformly at random and added to B(x). Thereafter, from each unfilled cell
bordering a filled cell, of all nodes sharing at least one common channel (and hence a feasible
link) with some node already in B(x), one is chosen uniformly at random, and is added to
B(x) (the link on the basis of which this node was chosen is added as a backbone link); the
cell containing the chosen node gets added to the set of filled cells. This process continues
iteratively, till there is one node from every cell in B(x). From our earlier observations, for
all nodes x, B(x) will eventually cover all cells with probability at least $1 - \frac{8}{n^{47}}$. Note that
from any node in B(x) there is a path to x comprising entirely of links in the backbone.
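The backbone-growing procedure can be sketched in code as follows. This is a simplified illustration rather than the exact construction: it assumes an m × m grid of cells on a torus, represents each backbone candidate as a hypothetical `(node_id, channel_set)` pair, processes filled cells in breadth-first order, and only lets a filled cell recruit from its 8 neighbouring cells (so that the chosen node is within range).

```python
import random

def grow_backbone(cells, start_cell, start_node, m, rng):
    """Grow a backbone from start_node over an m x m torus of cells.

    cells[(i, j)] is a list of (node_id, frozenset_of_channels) candidates.
    Returns {cell: chosen_node} if every cell gets filled, else None.
    """
    backbone = {start_cell: start_node}
    frontier = [start_cell]
    while frontier:
        ci, cj = frontier.pop(0)
        _, channels = backbone[(ci, cj)]
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                nb = ((ci + di) % m, (cj + dj) % m)
                if nb in backbone:
                    continue  # already a filled cell
                # candidates sharing at least one channel with the backbone node
                eligible = [node for node in cells[nb] if node[1] & channels]
                if eligible:
                    backbone[nb] = rng.choice(eligible)  # uniform random choice
                    frontier.append(nb)
    return backbone if len(backbone) == m * m else None
```

A cell that cannot be filled from one neighbour is retried when its other neighbours become filled, since every filled cell attempts to recruit from all of its unfilled neighbours.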

Now consider any pair of nodes x and y. If there exists a connected path between some
node in B(x) and some node in B(y) then x and y are connected. This can occur in many
different ways. Consider three possibilities (Fig. 4.1).

If B(x) and B(y) have a common node (Fig. 4.1(a)), then the two nodes are obviously
connected, as one can proceed from x on B(x) towards one of the common nodes, and
thence to y on B(y), and vice-versa.

Suppose the two backbones are disjoint. Then x and y are still connected if there is some
cell such that the node belonging to B(x) in that cell (let us call it $q_x$) can communicate
with the node belonging to B(y) in that cell (let us call it $q_y$), either directly, or through
a third node. $q_x$ and $q_y$ can always communicate directly if they share a common channel
(Fig. 4.1(b)). Hence, the case of interest is one in which no cell has $q_x$ and $q_y$ sharing a
channel.

Figure 4.1: Some ways in which backbones can be connected

Consider a particular cell, with $q_x$ and $q_y$ as the respective backbone members. If $q_x$
and $q_y$ do not share a common channel, we consider the event that there exists a third node
amongst the transition facilitators in the cell through whom they can communicate (Fig.
4.1(c)). Given backbones B(x) and B(y), and given a network cell in which $q_x$ and $q_y$ do
not share a channel, the probability that they can both communicate with a given third
node z that did not participate in backbone formation and is known to lie in the same cell,
is independent of the probability of a similar event in another cell.

Therefore, the overall probability can be lower-bounded by obtaining for one cell the
probability of $q_x$ and $q_y$ communicating via a third node z in the cell given they have no
common channel, taking into account that each cell has at least $\left\lfloor \frac{2 \log n}{p_{rnd}} \right\rfloor$ possibilities for z,
and treating it as independent across cells. We elaborate on this further:

Let $q_x$ have the set of channels $C(q_x) = \{c_{x_1}, \ldots, c_{x_f}\}$, and $q_y$ have the set of channels
$C(q_y) = \{c_{y_1}, \ldots, c_{y_f}\}$, such that $C(q_x) \cap C(q_y) = \emptyset$.

Consider a third node z amongst the transition facilitators in the same cell as $q_x$ and $q_y$.
Denote the set of z's channels by C(z). We desire that $C(z) \cap C(q_x) \ne \emptyset$ and $C(z) \cap C(q_y) \ne \emptyset$.

Note that a node x is a member of its own backbone. Thus $q_x = x$ in x's cell, and
if x is a transition facilitator, this would imply that $q_x = x$ is not a backbone candidate.
To maintain uniformity and clarity, let us therefore only consider cells other than those
in which x and y lie (this can lead to the exclusion of at most 2 cells). In any such cell,
$q_x$ and $q_y$ are both backbone candidates, and if they do not share a common channel, it
implies that they can communicate through a given transition facilitator z with probability

$$p_z = 1 - \frac{2\binom{c-f}{f} - \binom{c-2f}{f}}{\binom{c}{f}} \ge \frac{p_{rnd}^2}{40} \quad \text{(Lemma 16)}.$$

There are $\left\lfloor \frac{2 \log n}{p_{rnd}} \right\rfloor$ possibilities for z within that cell if neither x nor y lies in the cell (since
in that case $q_x, q_y$ are both backbone candidates), and all the possible z nodes have i.i.d.
channel assignments. Thus, the probability that $q_x$ and $q_y$ cannot communicate through
any z in the cell is at most $(1 - p_z)^{\left\lfloor \frac{2 \log n}{p_{rnd}} \right\rfloor}$, and the probability they can indeed do so is
$p_{xy} \ge 1 - (1 - p_z)^{\left\lfloor \frac{2 \log n}{p_{rnd}} \right\rfloor}$.

The  number  of  such  cells  is  at  least  —  2  =  —  2.  Therefore,  the  probability 


a{n) 


Prnd]! 


that  this  happens  in  none  of  the  —  2  cells  is  at  most  (1  —  Pxy)^°°'°^"  ^  <  (1  — 
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(recall  that  c  =  O(logn) 


and  therefore  pmd  =  ^(ki|^))  and  of  course  pmd  <  !)• 

Applying  the  union  bound  over  all  (2)  <  ^  node  pairs,  the  probability  that  some  pair 


of  nodes  are  not  connected  is  at  most 


-r2(- 


,  1  — 0(- — g — )+21ogn  A  1  .  ,1 

<  ig  iogz„  _  Applying  another 


2  —  2 

union  bound  over  this  probability,  the  probability  that  some  of  the  cells  are  not  sufficiently 
populated  (as  mentioned  earlier,  this  probability  is  at  most  and  the  probability 

that  some  backbone  cannot  be  grown  fully  (at  most  :^),  we  obtain  that  the  probability  of 
a  connected  network  converges  to  1,  as  n  ^  00.  □ 


4.5  Upper  Bound  on  Capacity 

Theorem 4 established that unless $r(n) = \Omega\left(\sqrt{\frac{\log n}{p_{rnd}\, n}}\right)$, some node is isolated with positive
probability. In Section 2.5 of Chapter 2, we discussed how the need to have r(n) = Ω(g(n))
implies that capacity is constrained to be $O\left(\frac{W}{n\, g(n)}\right)$. In light of this, it follows that, for the
random (c, f) model in the regime c = O(log n), the per-flow capacity is $O\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$.

4.6  A  Sub-Optimal  Lower  Bound  on  Capacity 

We  describe  a  construction  CRi  that  achieves  a  per-flow  throughput  of 
Though  it  is  not  optimal,  this  construction  is  of  interest  for  the  following  reasons: 

• The optimal procedure uses this construction for f < 100.

• This construction involves a simple routing and scheduling procedure, in contrast to
the optimal procedure for f > 100 described in Section 4.7. Thus, it exemplifies a
performance-complexity trade-off.

This construction is quite similar to the construction for adjacent (c, f) assignment.
The surface of the unit torus is divided into square cells of appropriate area a(n) each.
The transmission range is set to $\sqrt{8a(n)}$, thereby ensuring that any node in a given cell
is within range of any other node in any adjoining cell. The number of cells that interfere
with  a  given  cell  is  only  some  constant  (say  7).  We  choose  a(n)  = 

Lemma  17.  There  are  at  least  and  at  most  nodes  in  every  cell  w.h.p. 

Proof.  By  application  of  Lemma  59,  we  can  show  that  the  number  of  nodes  in  any  cell  lies 
between  and  with  probability  at  least  1  -  5^.  □ 

Lemma  18.  If  there  are  at  least  nodes  in  every  cell,  then  with  probability  at  least 

1  —  O(^),  for  each  of  the  c  ehannels,  there  are  at  least  25  log n  nodes  in  each  cell  that  ean 
switch  on  that  channel. 

Proof.  Let  us  consider  one  particular  cell  Tt.  Let  Xij  =  1  if  node  j  can  switch  on  channel 

i,  and  0  else.  Px[Xij  =  1]  =  ^,  and,  for  a  given  i,  all  the  W^’s  are  independent.  Let 

Xi  =  ^  Xij.Th.en  E\Xi\  >  50 log n.  By  application  of  the  Chernoff  bound  in  Lemma  53 
j&H 

(with  /3  =  2)  ,  we  obtain: 

Pr[Xj  <  25  log  n]  <  exp(— ^  (4.10) 

Since  there  are  c  =  0(log  n)  channels,  the  union  bound  yields  that  Pr[Xj  <  25  log  n  for  any  i  G 
l,2,...,c]  <  iiF  =  O(^)  ^  0{^).  Further,  since  there  are  ^  =  xoofer  <  ”  cells, 
another  application  of  the  union  bound  yields; 

Pr[  less  than  25 log n  nodes  per  channel  in  any  cell]  =  O(^)  (4-11) 

□ 

4.6.1  Routing 

Initially, each flow is assigned a source channel l, as well as a target destination channel
r. The source channel for a flow originating at node S is chosen according to the uniform
distribution from the f channels available at S. The destination channel may be chosen
from amongst the f channels available at destination D in any manner.

We need to find a feasible path from S to D. To obtain a feasible path, we try to find
a sequence of nodes $v_0 = S, v_1, v_2, \ldots, v_i, \ldots, v_k = D$ such that, for all 0 ≤ m ≤ i, $v_m$ can
operate on channel l, and for all i ≤ m ≤ k, $v_m$ can operate on r. Thus, node $v_i$ on the route
is capable of switching (operating) on both l and r, and this node serves as a transition
point for the flow's route. To be able to find such a node $v_i$, we may need to inspect a
certain minimum number of cells.

Lemma 19. If each flow traverses and inspects $h \ge \left\lceil \frac{2(c-1)}{25(f-1)} \right\rceil$ distinct cells, where the cells
to be inspected are chosen in a manner independent of channel presence in that cell (the
cells inspected by any single flow should be distinct; two flows may traverse the same cell),
a transition-point (relay node) that can switch on both the flow's source channel, and the
flow's destination channel will be found by each flow w.h.p.

Proof. Consider a particular flow. From Lemma 17, each cell has at least $\frac{50 c \log n}{f}$ nodes
w.h.p. The probability that there is no node capable of operating on both channels i and j
in a given cell along the flow's route is at most $\left(1 - \frac{f(f-1)}{c(c-1)}\right)^{\frac{50 c \log n}{f}}$ (since nodes are assigned
channels in an i.i.d. manner). Thus the probability of not finding such a node after h hops
is at most $\left(1 - \frac{f(f-1)}{c(c-1)}\right)^{\frac{50 h c \log n}{f}}$. If $h \ge \left\lceil \frac{2(c-1)}{25(f-1)} \right\rceil$, then after traversing h distinct cells, the
probability of not finding such a node is at most $\left(1 - \frac{f(f-1)}{c(c-1)}\right)^{\frac{4 c (c-1) \log n}{f(f-1)}} \le \exp(-4 \log n) = \frac{1}{n^4}$.

Applying the union bound over all n flows, the probability that this should happen
for even one flow is at most $\frac{1}{n^3}$. Hence, all flows will be able to make the required
transition w.h.p., after traversing $h \ge \left\lceil \frac{2(c-1)}{25(f-1)} \right\rceil$ distinct hops. □
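The quantities in Lemma 19 are easy to evaluate for concrete parameters. The sketch below assumes, as the proof appears to read, that each cell holds at least $\frac{50 c \log n}{f}$ nodes and that the threshold is $h \ge \left\lceil \frac{2(c-1)}{25(f-1)} \right\rceil$; both constants and all function names should be treated as assumptions made for illustration.

```python
import math

def both_channels_prob(c, f):
    """Probability that a random f-subset of c channels contains two given distinct channels."""
    return f * (f - 1) / (c * (c - 1))

def cells_to_inspect(c, f):
    """Smallest h satisfying the assumed threshold ceil(2(c-1) / (25(f-1)))."""
    return math.ceil(2 * (c - 1) / (25 * (f - 1)))

def miss_bound(n, c, f, h):
    """Bound (1 - f(f-1)/(c(c-1)))^(h * 50 c log n / f) on finding no transition node in h cells."""
    nodes = h * 50 * c * math.log(n) / f
    return (1 - both_channels_prob(c, f)) ** nodes
```

For example, with n = 1000, c = 20 and f = 3, a single inspected cell already suffices under these assumptions, and the resulting bound falls below $n^{-4}$.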

Note that $\frac{2(c-1)}{25(f-1)} \le \frac{c}{f}$ (since f ≥ 2). Thus, if we ensure that each flow's route passes through at
least $\left\lceil \frac{c}{f} \right\rceil$ intermediate cells, we will be able to find an end-to-end feasible route for each
flow w.h.p. Therefore, we adopt the following routing strategy:

The (almost) straight-line SD'D path is followed if either source and destination channels
are the same, or if the straight-line segment SD' provides $h \ge \left\lceil \frac{c}{f} \right\rceil$ intermediate
hops. If S and D' (hence also D) lie close to each other, the hop-length of the straight line
cell-to-cell path can be much smaller. In this case, a detour path is chosen, in a manner
similar to that described in Chapter 3 for adjacent (c, f) assignment, and depicted in Fig.
4.4, by considering a circle of radius $\left\lceil \frac{c}{f} \right\rceil r(n)$ centered at S, selecting a point P on the
circumference of that circle, and routing the flow along the sequence of cells traversed by
SP, PD', and then a possible additional last hop.

Similar to the construction for adjacent (c, f) assignment described in Chapter 3, we
associate two phases with a flow's route: a progress-on-source-channel phase, and a
ready-for-transition phase. We stipulate that a non-detour-routed flow stays in the
progress-on-source-channel phase along the route, till there are only $\left\lceil \frac{c}{f} \right\rceil$ intermediate hops left to the
destination. At this point, it enters a ready-for-transition phase, and is prepared to make
a transition given an appropriate relay node that provides the requisite channel-pair for
transition (the relay selection strategy is described later). A detour-routed flow is always
in ready-for-transition phase.

The need to perform detour routing for some source-destination pairs does not have any
substantial effect on the number of flow-routes that traverse a cell.

Lemma 20. The number of straight-line SD'D flow-routes that traverse any cell is $O(n\sqrt{a(n)})$.

Proof. From Lemma 61, the number of SD' straight-lines traversing a single cell is $O(n\sqrt{a(n)})$,
yielding $O(n\sqrt{a(n)})$ flow-routes.

We must now separately consider the number of routes whose last D'D hop may enter
this cell. If D is in the same cell as D', there is no extra hop. Otherwise, the number of
flows for which D' lies in one of the 8 adjacent cells is O(na(n)) w.h.p. (since it follows
from Lemma 59 (applied to the set of n pseudo-destinations) that the number of
pseudo-destinations in any cell is O(na(n))). Since $na(n) = O(n\sqrt{a(n)})$, the total number of
traversing flow-routes is $O(n\sqrt{a(n)})$. □

Lemma 21. If the number of flow-routes traversing any cell is x with only straight-line
routing, it is $x + O\left(n \frac{c^2}{f^2} r^2(n)\right) = x + O(\log^4 n)$ with detour routing.

Proof. The detour occurs only when the straight-line route has less than $\left\lceil \frac{c}{f} \right\rceil$ intermediate
hops, and the new route lies entirely within a circle of radius $\left\lceil \frac{c}{f} \right\rceil r(n)$ around the source.

Thus, the extra flows that may pass through a cell (compared to straight-line routing)
are only those whose sources lie within a distance $\left\lceil \frac{c}{f} \right\rceil r(n)$ from some point in this cell.

All such possible sources fall within a circle of radius $\left(1 + \left\lceil \frac{c}{f} \right\rceil\right) r(n)$, and hence area
$a_c(n) = \pi \left(1 + \left\lceil \frac{c}{f} \right\rceil\right)^2 r^2(n)$. Noting that the source locations are i.i.d., and applying
Lemma 60, any circle of this area has $O(n a_c(n))$ nodes, and hence $O(n a_c(n))$ sources w.h.p.
Thus, the number of extra flows that traverse any cell due to detour routing is $O(n a_c(n))$.
Each such flow may traverse a cell at most twice along the SPD' segment, and possibly
once more in the additional last hop. Therefore, the total number of flow-routes is
$x + O\left(n \frac{c^2}{f^2} r^2(n)\right) = x + O(\log^4 n)$ w.h.p. □

Lemma 22. The number of flow-routes traversing any cell is $O(n\sqrt{a(n)})$ even with detour
routing.

Proof. This follows from Lemma 20, Lemma 21 and the observation that $O(\log^4 n) =
O(n\sqrt{a(n)})$ for our choice of a(n). □

Lemma 23. The number of flow-routes traversing any cell in ready-for-transition phase is
$O(\log^4 n)$ w.h.p.

Proof. We first account for the flows traversing the cell along the SD' segment, and later
account for the possible additional D'D hop.

By our construction, a non-detour-routed flow enters the ready-for-transition phase only
when it is $\Theta\left(\frac{c}{f}\right)$ hops away from its destination. All such flows must have their
pseudo-destinations within a circle of radius $O\left(\frac{c}{f} r(n)\right)$ centered in the cell. The number of
pseudo-destinations that lie within any circle of radius $O\left(\frac{c}{f} r(n)\right)$ from the cell is $O\left(n \frac{c^2}{f^2} r^2(n)\right) =
O\left(\frac{c^3}{f^3} \log n\right) = O(\log^4 n)$ w.h.p. (by suitable choice of a(n) in Lemma 60, and by
observing that c = O(log n)).

A detour-routed flow is always in ready-for-transition phase. From Lemma 21, there
are at most $O(\log^4 n)$ such flows, and they can traverse a cell at most twice along the SD'
(more precisely SPD') segment, yielding $O(\log^4 n)$ distinct flow-routes.

We now account for the fact that all the above routed flows could have an additional
last D'D hop that may need to be counted separately. As argued in the proof of Lemma
20, these yield $O(na(n)) = O\left(\frac{c \log n}{f}\right) = O(\log^4 n)$ additional traversals.

Hence the number of flow-routes traversing any cell in ready-for-transition phase
(counting repeat traversals separately) is $O(\log^4 n)$ w.h.p. □

Relay Node Selection. In the progress-on-source-channel phase, the flow's packets are
transmitted on the source channel. During this phase, the next hop node is chosen to be
the node in the next cell which has the smallest number of flow-links assigned so far for
relaying on that channel, amongst all nodes that can switch on the source channel.
In the ready-for-transition phase, the goal is to seek a relay node that can operate on
both the source channel and the destination channel, and therefore is capable of serving
as the transition point. The flow makes use of the first opportunity that presents itself, i.e., if a
node in an on-route cell provides the source-destination channel pair, the flow is assigned
to that node for relaying (the flow's packets are received by the node on the source channel,
and transmitted to the next hop node on the destination channel). Once the flow has made
the transition into the destination channel, it remains on that channel. In the ready-for-transition
phase, a flow may be assigned to any eligible node that provides either the transition
opportunity, or the source channel (for flows yet to find a transition), or the destination
channel (for flows that have already transitioned into their destination channel).


4.6.2  Load  Balance  within  a  Cell 

A flow-link is said to enter a cell H on a channel j if the flow's route includes a hop (link)
$(v_{i-1}, v_i)$, where $v_{i-1}$ is in a cell adjacent to H, $v_i$ is in H, and $v_{i-1}$ transmits the flow's
packets to $v_i$ using channel j (this naturally implies that both $v_{i-1}$ and $v_i$ can operate on
channel j). Similarly, a flow-link is said to leave a cell H on channel j if the route includes
a link $(v_i, v_{i+1})$, where $v_i$ is in H, $v_{i+1}$ is in a cell adjacent to H, and $v_i$ transmits the flow's
packets to $v_{i+1}$ using channel j.


Per-Channel Load

Recall that each cell has $O(na(n))$ nodes w.h.p., and $O(n\sqrt{a(n)})$ flows traversing it w.h.p.


Lemma 24. The number of flow-links that enter any cell on any single channel is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.


Proof. Consider a particular cell $\mathcal{H}$. A flow-link may enter the cell on channel $i$ if:

1. The flow's source channel is $i$ and it is in progress-on-source-channel phase

2. The flow is in ready-for-transition phase, its source channel is $i$, but it is yet to find a transition into the destination channel

3. The flow is in ready-for-transition phase, its destination channel is $i$, and it has already made a transition

Recall that the sequence of cells traversed by a flow was chosen in a manner that did not depend on the channels the source node can switch on. Since each node's interface is assigned a random subset of $f$ channels, and it further makes an i.i.d. choice of a source channel from amongst these, it follows that a flow's source channel can be any of $1, 2, \ldots, c$ with equal probability. Furthermore, the source channels for different flows are independent. However, the destination channels of flows are not necessarily independent, since two flows with the same destination are more likely to have the same destination channel.

Thus, if a flow-link enters the cell in progress-on-source-channel phase (also referred to as a non-transitioning flow-link), it is equally likely to be on any channel:

$$\Pr[\text{flow-link is on channel } i] = \frac{1}{c}, \quad \forall\, 1 \le i \le c$$
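The uniformity claim can be checked exactly for small parameters. The following sketch enumerates every $f$-subset of the $c$ channels, picks the source channel uniformly within the subset, and accumulates exact probabilities. The function name `source_channel_distribution` and the small values of `c` and `f` are illustrative assumptions, not part of the construction.

```python
from fractions import Fraction
from itertools import combinations


def source_channel_distribution(c, f):
    """Exact distribution of the source channel under a random (c, f) assignment.

    A node receives a uniformly random f-subset of channels 1..c and then
    picks its source channel uniformly from that subset.
    """
    subsets = list(combinations(range(1, c + 1), f))
    p_subset = Fraction(1, len(subsets))
    dist = {ch: Fraction(0) for ch in range(1, c + 1)}
    for subset in subsets:
        for ch in subset:
            # Probability of this subset times probability of picking ch from it.
            dist[ch] += p_subset * Fraction(1, f)
    return dist


dist = source_channel_distribution(c=5, f=2)
```

Each channel appears in a fraction $f/c$ of the subsets and is then chosen with probability $1/f$, so every entry of `dist` equals $1/c$.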


Denote the number of flow-links entering the cell in progress-on-source-channel phase by $m$. From Lemma 20 and Lemma 22, it follows that $m = O(n\sqrt{a(n)})$.

Let $X_{ij}$ be an indicator variable which is 1 if flow-link $j$ enters the cell on channel $i$, and is 0 else.

Then $X_i = \sum_{j=1}^{m} X_{ij}$ denotes the number of flow-links in progress-on-source-channel phase that enter the cell on channel $i$, and $E[X_i] = \frac{m}{c}$. The $X_{ij}$'s are i.i.d. random variables for a given $i$, as each flow's source channel is chosen in an i.i.d. manner (though they may not be independent for different $i$, since $X_{ij} = 1 \implies X_{kj} = 0\ \forall k \ne i$). Hence, we may set $(1+\beta)E[X_i] = \max\{\frac{e^2 m}{c}, 3\log n\}$ (note that $\beta \ge e^2 - 1 > 0$), apply the Chernoff bound from Lemma 51, and obtain that:


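$$\begin{aligned}
\Pr\left[X_i \ge \max\left\{\frac{e^2 m}{c}, 3\log n\right\}\right] &\le \left(\frac{e^{\beta}}{(1+\beta)^{(1+\beta)}}\right)^{E[X_i]} \le \frac{e^{(1+\beta)E[X_i]}}{(1+\beta)^{(1+\beta)E[X_i]}} \\
&= \left(\frac{e\,E[X_i]}{\max\{\frac{e^2 m}{c}, 3\log n\}}\right)^{\max\{\frac{e^2 m}{c},\, 3\log n\}} \\
&\le \exp\left(-(1+\beta)E[X_i]\right) \\
&= \exp\left(-\max\left\{\frac{e^2 m}{c}, 3\log n\right\}\right)
\end{aligned} \qquad (4.12)$$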
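The step from the Chernoff expression to $\exp(-(1+\beta)E[X_i])$ relies only on $1+\beta \ge e^2$. The sketch below checks this numerically in log-space for a few hypothetical values of $m$, $c$ and $n$; the helper names are assumptions made for the illustration, and natural logarithms are assumed throughout.

```python
import math


def log_chernoff_bound(mu, beta):
    """Natural log of (e^beta / (1+beta)^(1+beta))^mu."""
    return mu * (beta - (1.0 + beta) * math.log1p(beta))


def log_simplified_bound(mu, beta):
    """Natural log of exp(-(1+beta) * mu)."""
    return -(1.0 + beta) * mu


def check(m, c, n):
    """Return (beta, log Chernoff bound, log simplified bound) for E[X_i] = m/c."""
    mu = m / c
    target = max(math.e ** 2 * mu, 3.0 * math.log(n))  # (1 + beta) * E[X_i]
    beta = target / mu - 1.0
    return beta, log_chernoff_bound(mu, beta), log_simplified_bound(mu, beta)


# Hypothetical parameter triples (m, c, n), chosen only for illustration.
results = [check(m, c, n) for (m, c, n) in [(50, 10, 1000), (2, 10, 1000), (10**4, 7, 10**6)]]
```

Working with logarithms avoids floating-point underflow when the bounds themselves are astronomically small.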
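Taking the union bound over all $c$ channels, the probability that any channel has more than $\max\{\frac{e^2 m}{c}, 3\log n\}$ flows is at most $c\exp(-\max\{\frac{e^2 m}{c}, 3\log n\})$. Taking another union bound over all $\frac{1}{a(n)}$ cells, this probability is at most $\frac{c}{a(n)}\exp(-\max\{\frac{e^2 m}{c}, 3\log n\})$, which vanishes as $n$ grows because the exponent is at least $3\log n$, while $c = O(\log n)$ and there are fewer than $n$ cells.

Since $\max\{\frac{e^2 m}{c}, 3\log n\} = O\left(\frac{n\sqrt{a(n)}}{c}\right)$ (note that $m$ is $O(n\sqrt{a(n)})$ and $\log n$ is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$), we have proved that the number of flow-links that enter any cell in progress-on-source-channel phase on any single channel is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.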

We now account for the flow-links that enter a cell in their ready-for-transition phase. From Lemma 23, the number of flow-routes traversing any cell in this phase is polylogarithmic in $n$ w.h.p. (counting repeat traversals separately). Thus, the additional overhead posed by the corresponding flow-links on any channel is also polylogarithmic in $n$ w.h.p.

Hence, the per-channel load in each cell is at most $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ plus this polylogarithmic term, which is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p. □

Lemma 25. The number of flow-links that leave any cell on any single channel is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.

The proof follows by taking note of Lemma 24, and then applying the same argument as that for Lemma 12.

Per-Node  Load 

Lemma 26. The number of flow-links that are assigned to any one node in any cell is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p.

Proof.  A  node  is  always  assigned  an  outgoing  link  for  the  single  flow  for  which  it  is  the 
source.  A  node  is  also  assigned  an  incoming  flow-link  for  flows  for  which  it  is  the  destination 
(these  flows  terminate  in  that  cell),  and  there  are  O(logn)  such  flows  for  any  node  w.h.p. 
(Lemma  1). 

In  addition,  a  node  may  be  assigned  flow-links  as  a  relay  on  the  routes  of  other  flows 
(for  each  such  route,  it  is  assigned  an  incoming  link  as  well  as  an  outgoing  link). 

Some of these flows may be in the ready-for-transition phase; for these flows it may provide the required channel pair to facilitate a transition, or provide the source channel (flows yet to find a transition) or destination channel (flows that have already transitioned). From Lemma 23, the number of such flow-routes traversing the cell is polylogarithmic in $n$ w.h.p. Thus, a node or channel can only have polylogarithmically many such flow-links assigned for relaying.

It may also be assigned as a relay on the routes of flows that are in progress-on-source-channel phase, and do not originate in the cell. We have already established in Lemma 24 that the number of flow-links that enter on a given channel in any cell is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p. By construction, we have chosen cell sizes such that there are at least $25\log n$ nodes on each channel in each cell w.h.p. (Lemma 18). Also $c = O(\log n)$. A flow-link in progress-on-source-channel phase is always assigned to the node with least load on that channel so far (from amongst all nodes in that cell capable of switching on that channel). From Lemma 24, and the fact that each node can switch on only $f$ channels, the number of such flows that are assigned to any one node is $O\left(\frac{f n\sqrt{a(n)}}{c\log n}\right) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p. (since $f \le c = O(\log n)$).

The resultant number of assigned flow-links per node is the sum of 1 (the flow originating at the node), $O(\log n)$ (flows terminating at the node), the polylogarithmic ready-for-transition contribution from Lemma 23, and $O\left(\frac{n\sqrt{a(n)}}{c}\right)$, which is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ w.h.p. □


4.6.3 Transmission Schedule


The transmission schedule is obtained in a manner similar to the procedure in Section 3.5.3 of Chapter 3, by first obtaining a global inter-cell schedule (recall that the chromatic number of the cell-interference graph is bounded by a constant independent of $n$), and then constructing a conflict graph for intra-cell scheduling. From Lemmas 25 and 26, the degree of the conflict graph is $O\left(\frac{n\sqrt{a(n)}}{c}\right)$. It is well-known that a graph with maximum node degree $d$ has chromatic number at most $d+1$, and so the graph can be colored using $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ colors.
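The $d+1$ bound is constructive: a greedy pass that gives each vertex the smallest color not used by its already-colored neighbors never needs more than $d+1$ colors. The sketch below illustrates this on a generic adjacency-list graph; it is not the thesis's scheduling procedure, and the function names and example graphs are assumptions for illustration.

```python
def greedy_coloring(adj):
    """Color vertices greedily; uses at most (max degree + 1) colors.

    adj: dict mapping vertex -> iterable of neighbouring vertices (undirected).
    Returns a dict mapping vertex -> color index (0, 1, 2, ...).
    """
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # a vertex has at most d coloured neighbours, so c <= d
            c += 1
        color[v] = c
    return color


def max_degree(adj):
    """Maximum vertex degree of the graph."""
    return max(len(list(neighbours)) for neighbours in adj.values())


# Two hypothetical conflict graphs: a 5-cycle and the complete graph K4.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
coloring_cycle = greedy_coloring(cycle5)
coloring_k4 = greedy_coloring(k4)
```

Each color class corresponds to a set of flow-links that can share a subslot, so the number of colors bounds the number of subslots needed.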

Thus, the cell-slot is divided into $O\left(\frac{n\sqrt{a(n)}}{c}\right) = O\left(\sqrt{\frac{n\log n}{cf}}\right)$ equal length subslots, and each outgoing flow-link gets assigned a slot for transmission on its assigned channel at the per-channel rate of $\frac{W}{c}$ (the slot-assignment is obtained via the conflict-graph coloring described earlier). This yields that each flow will get $\Omega\left(W\sqrt{\frac{f}{cn\log n}}\right)$ throughput.

In light of the above, we obtain the following theorem:

Theorem 6. With a random $(c, f)$ channel assignment, when $c = O(\log n)$, construction $CR_1$ achieves throughput of $\Omega\left(W\sqrt{\frac{f}{cn\log n}}\right)$ for each flow.

4.7 Optimal Lower Bound on Capacity


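In this section, we present a construction $CR^*$ that achieves $\Omega\left(W\sqrt{\frac{p_{rnd}}{n\log n}}\right)$ throughput for each flow. In light of the upper bound of $O\left(W\sqrt{\frac{p_{rnd}}{n\log n}}\right)$ established in Section 4.5, $CR^*$ is optimal for the regime $c = O(\log n)$. This establishes the capacity with random $(c, f)$ assignment as $\Theta\left(W\sqrt{\frac{p_{rnd}}{n\log n}}\right)$ in the regime $c = O(\log n)$.

We first present a construction $CR_2$ that achieves $\Omega\left(W\sqrt{\frac{p_{rnd}}{n\log n}}\right)$ when $f \ge 100$ (thus necessarily $c \ge 100$).

We now describe construction $CR_2$.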


Subdivision of network region into cells. Similar to previous constructions, the surface of the unit torus is divided into square cells of area $a(n)$ each, and the transmission range is set to $\sqrt{8a(n)}$, thereby ensuring that any node in a given cell is within range of any other node in any adjoining cell.
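The choice of range can be sanity-checked geometrically: the farthest pair of points in two adjoining square cells of side $\sqrt{a(n)}$ are opposite corners of diagonal neighbors, at distance $2\sqrt{2}\sqrt{a(n)} = \sqrt{8a(n)}$. The sketch below verifies this by brute force over cell corners (the maximum distance between two convex regions is attained at vertices). It ignores torus wrap-around, and the function names and the sample cell area are assumptions for illustration.

```python
import itertools
import math


def transmission_range(cell_area):
    """Range used in the construction: sqrt(8 * a(n))."""
    return math.sqrt(8.0 * cell_area)


def cell_corners(x0, y0, side):
    """Four corners of the square cell whose lower-left corner is (x0, y0)."""
    return [(x0 + dx * side, y0 + dy * side) for dx in (0, 1) for dy in (0, 1)]


def max_distance_between_adjoining_cells(cell_area):
    """Largest distance between a point of a cell and a point of any of its 8 neighbours."""
    side = math.sqrt(cell_area)
    home = cell_corners(0.0, 0.0, side)
    worst = 0.0
    for ox, oy in itertools.product((-1, 0, 1), repeat=2):
        if (ox, oy) == (0, 0):
            continue  # skip the cell itself
        for p in home:
            for q in cell_corners(ox * side, oy * side, side):
                worst = max(worst, math.dist(p, q))
    return worst


area = 0.01  # hypothetical cell area
worst_case = max_distance_between_adjoining_cells(area)
```

The brute-force maximum coincides with `transmission_range(area)`, so the stated range is exactly the smallest one that covers all adjoining cells.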

We choose $a(n) = \frac{250\max\{\log n, c\}}{p_{rnd}\, n} = O\left(\frac{\log n}{p_{rnd}\, n}\right)$ (since $c = O(\log n)$).

Lemma 27. Each cell has at least $\frac{4na(n)}{5} = \frac{200\max\{\log n, c\}}{p_{rnd}}$ and at most $\frac{6na(n)}{5} = \frac{300\max\{\log n, c\}}{p_{rnd}}$ nodes w.h.p.

Proof.  We  have  chosen  a(n)  =  ^^Omaxjiog n,c}  ^  t  \  y  if  c  <  logn,  we  can  set 

a  =  >  1  in  Lemma  59,  and  when  c  >  logn,  i.e.,  c  =  nlogn(  for  some  n  >  1)  (recall 

that  c  =  O(logn)),  we  can  set  a  =  >  1  (noting  that  in  either  case  a  <  for 

large  enough  n),  to  obtain  that  the  following  holds  with  probability  at  least  1  —  for 

all  cells  Ti'. 


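$$\frac{250\max\{\log n, c\}}{p_{rnd}} - 50\log n \;\le\; Pop(\mathcal{H}) \;\le\; \frac{250\max\{\log n, c\}}{p_{rnd}} + 50\log n$$

where $Pop(\mathcal{H})$ denotes the number of nodes in cell $\mathcal{H}$.

Thereafter, noting that $\frac{250\max\{\log n, c\}}{p_{rnd}} - 50\log n \ge \frac{200\max\{\log n, c\}}{p_{rnd}}$ and $\frac{250\max\{\log n, c\}}{p_{rnd}} + 50\log n \le \frac{300\max\{\log n, c\}}{p_{rnd}}$ (both hold because $p_{rnd} \le 1$ gives $\frac{50\max\{\log n, c\}}{p_{rnd}} \ge 50\log n$) completes the proof. □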


The following facts will also be used extensively in subsequent proofs:

$$\frac{f}{c} \le p_{rnd} \le 1 \qquad (4.13)$$




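For large $n$, since $c = O(\log n)$, and $2 \le f \le c$:

$$na(n) = \frac{250\max\{\log n, c\}}{p_{rnd}} = O(\log^2 n)$$

$$\frac{n\sqrt{a(n)}}{c} = \frac{1}{c}\sqrt{\frac{250\, n\max\{\log n, c\}}{p_{rnd}}} = \Omega\left(\sqrt{\frac{n}{\log n}}\right)$$

$$\therefore\ g(n) = O(na(n)) \implies g(n) = O\left(\frac{n\sqrt{a(n)}}{c}\right) \qquad (4.14)$$

$$\frac{1}{\sqrt{a(n)}} = \sqrt{\frac{p_{rnd}\, n}{250\max\{\log n, c\}}} = O\left(\sqrt{\frac{n}{\log n}}\right)$$

$$\frac{n\sqrt{a(n)}}{c} = \frac{1}{c}\sqrt{\frac{250\, n\max\{\log n, c\}}{p_{rnd}}} = \Omega\left(\sqrt{\frac{n}{\log n}}\right)$$

$$\therefore\ g(n) = O\left(\frac{1}{\sqrt{a(n)}}\right) \implies g(n) = O\left(\frac{n\sqrt{a(n)}}{c}\right) \qquad (4.15)$$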


Some properties of $SD'D$ routing. Recall that we use the traffic model of [43], where each source $S$ first chooses a pseudo-destination $D'$, and then selects the node $D$ nearest to it as the actual destination. In [43], the flow traversed cells intersected by the straight line $SD'$, and then took an extra last hop if required (we refer to this as $SD'D$ routing). As we will show later, it may not always suffice to use $SD'D$ routing. However, this is still an important component of our routing procedure. We state and prove certain relevant properties:


Lemma 28. Given only straight-line $SD'$ routing (no additional last-hop), the number of flows that enter any cell on their $i$-th hop is at most $\left\lfloor\frac{5na(n)}{4}\right\rfloor$ w.h.p., for any $i$.

Proof. Let us consider the straight-line part $SD'$ of an $SD'D$ route. All the $n$ $SD'$ lines are i.i.d. Denote by $X_i^k$ the indicator variable which is 1 if the flow $k$ enters a cell $\mathcal{H}$ on its $i$-th hop. Then, as observed in [36] (proof of Lemma 3 in [36]), for i.i.d. straight lines, the $X_i^k$'s are identically distributed, and $X_i^k$ and $X_i^l$ are independent for $k \ne l$. However, for a given flow $k$, at most one of the $X_i^k$'s can be 1, as a flow-route only traverses a cell once along the straight line $SD'$. Then $\Pr[X_i^k = 1] = a(n) = \frac{250\max\{\log n, c\}}{p_{rnd}\, n}$.

Let $X_i = \sum_{k=1}^{n} X_i^k$. Then $E[X_i] = na(n)$. Also, for a given $i$, the $X_i^k$'s are independent [36]. Then by application of the Chernoff bound from Lemma 52 (with $\beta = \frac{1}{4}$):

$$\Pr\left[X_i \ge \frac{5na(n)}{4}\right] \le \exp\left(-\frac{na(n)}{48}\right)$$

$$\Pr\left[X_i \ge \frac{1250\max\{\log n, c\}}{4p_{rnd}}\right] \le \exp\left(-\frac{250\max\{\log n, c\}}{48p_{rnd}}\right) < \frac{1}{n^5} \qquad (4.16)$$

The maximum value that $i$ can take is less than $n$, and the number of cells is also less than $n$. By application of the union bound over all $i$ and all cells $\mathcal{H}$, the probability that $X_i \ge \frac{5na(n)}{4}$ for some $i$ and some cell is less than $\frac{1}{n^3}$, and hence the number of flows that enter any cell on any hop is less than $\frac{5na(n)}{4} = \frac{1250\max\{\log n, c\}}{4p_{rnd}}$ with probability at least $1 - \frac{1}{n^3}$. Since $X_i$ is an integer, this implies that it is at most $\left\lfloor\frac{5na(n)}{4}\right\rfloor$ w.h.p. □


Lemma 29. If a node is destination of some flow, that flow's pseudo-destination must lie within either the same cell, or an adjacent cell w.h.p.

Proof. It was shown in the proof of Lemma 1 that a flow will be assigned to a destination lying within a circle of radius $\sqrt{\frac{100\log n}{\pi n}}$ centered around the pseudo-destination w.h.p. Conversely, if a flow is assigned to a node, then the pseudo-destination must lie within a circle of radius $\sqrt{\frac{100\log n}{\pi n}}$ centered around the node.

It is easy to see that a circle of radius $\sqrt{\frac{100\log n}{\pi n}}$ centered at a node will fall completely within the cells adjacent to the node's cell (by our choice of cell-area $a(n)$). Hence, if a node is destination of some flow, that flow's pseudo-destination must lie within either the same cell, or an adjacent cell. □

Lemma 30. The number of $SD'D$ routes that traverse any cell is $O(n\sqrt{a(n)})$ w.h.p.

Proof. Consider a cell $\mathcal{H}$. From Lemma 61 (which is obtained from a lemma in [36]), we know that the number of $SD'$ straight-lines traversing any single cell is $O(n\sqrt{a(n)})$. We must now consider the number of routes whose last $D'D$ hop may enter this cell $\mathcal{H}$. If $D$ is in the same cell as $D'$, there is no extra hop. Let us now consider the case that $D'$ lies in one of the 8 adjacent cells, but $D$ lies in the cell $\mathcal{H}$ (from Lemma 29, we know that $D$ lies in cell $\mathcal{H}$ only if $D'$ lies in $\mathcal{H}$ or its adjacent cells). The number of flows for which $D'$ lies in one of the 8 cells adjacent to $\mathcal{H}$ is $O(na(n))$ w.h.p. (by applying Lemma 59 to the set of $n$ pseudo-destinations). Also from (4.14), and the fact that $c \ge 1$, we know that $O(na(n)) \implies O(n\sqrt{a(n)})$. Therefore, the total number of traversing routes is $O(n\sqrt{a(n)})$. □

Having stated and proved these preliminary lemmas, we now establish some properties of the spatial distribution of channels, and thereafter describe our scheduling/routing procedure:

Definition 2. (Usability Threshold for Channel Use) The usability threshold for channel use is denoted by $M_u$, and $M_u = \left\lceil\frac{90 f\max\{\log n, c\}}{c\, p_{rnd}}\right\rceil$.

Lemma 31. If there are at least $\frac{200\max\{\log n, c\}}{p_{rnd}}$ nodes in every cell, of which we choose $\frac{180\max\{\log n, c\}}{p_{rnd}}$ nodes uniformly at random as candidates to examine, then, in each cell, amongst those $\frac{180\max\{\log n, c\}}{p_{rnd}}$ nodes, at least $c - \left\lfloor\frac{f}{4}\right\rfloor$ channels have at least $M_u$ candidate nodes capable of switching on them, w.h.p.

Proof. Consider any single cell $\mathcal{H}$. Denote by $S$ the set of $\frac{180\max\{\log n, c\}}{p_{rnd}}$ nodes lying in cell $\mathcal{H}$ that are chosen uniformly at random for examination. Denote by $I_{ji}$ the indicator variable that is 1 if a node $j$ can switch on channel $i$ and 0 else. Then $\Pr[I_{ji} = 1] = \frac{f}{c}$, and for a given $i$, the $I_{ji}$ are independent. Let $X_i = \sum_{j \in S} I_{ji}$ be the number of nodes in $S$ capable of switching on channel $i$. Then $E[X_i] = \frac{f}{c}\left(\frac{180\max\{\log n, c\}}{p_{rnd}}\right)$, and $M_u = \left\lceil\frac{E[X_i]}{2}\right\rceil$.

In  light  of  Lemma  14,  we  obtain  the  following: 


E[X,]  = 


180/  maxjiog  n,  c} 


^Prnd 


.  ,  180  maxjiog  n,  c}  90  maxjiog  n,  c} 

^[^i\  ^  - ■  ro.c  Cl -  ^  1 - 

mmj2/,  |}  / 

E[Xi]  >  180/  from  (4.17)  (noting  that  pmd  <  1) 


180 maxjiog n,  c|  180 maxjiog n,  c)  Togn  r- 

^  >  90maxj^,  >  900 


minj2/,  f  } 


ogn 


(by  applying  Lemma  15) 


(4.17) 

(4.18) 

(4.19) 

(4.20) 


From  the  preceding  equations,  it  also  follows  that: 


Mu  > 


max 


.45  maxjiog  n,  c|  „  ,  r - 

j - 90/,  4500^} 
Let $I'_i$ denote an indicator variable which is 1 if $X_i < \frac{E[X_i]}{2}$ and 0 else. Applying the Chernoff bound in Lemma 53:

$$\Pr[I'_i = 1] = \Pr\left[X_i < \frac{E[X_i]}{2}\right] \le \Pr\left[X_i \le \frac{E[X_i]}{2}\right] \le \exp\left(-\frac{E[X_i]}{8}\right) \qquad (4.21)$$

Besides, the $I'_i$'s are negatively correlated, as each node has a uniformly random subset of $f$ channels assigned to it, and thus, in the given set of nodes $S$, having some channel (say $i$) assigned to a large number of nodes can only decrease the presence of another channel (say $k$).

Let $X = \sum_{i=1}^{c} I'_i$. Then, noting that $\log c \le \frac{E[X_i]}{200}\ \forall c \ge 2$:

$$E[X] \le c\exp\left(-\frac{E[X_i]}{8}\right) = \exp\left(-\frac{E[X_i]}{8} + \log c\right) \le \exp\left(-\frac{3E[X_i]}{25}\right) \quad \left(\because \log c \le \frac{E[X_i]}{200}\ \forall c \ge 2\right) \qquad (4.22)$$


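Due to the negative correlation of the $I'_i$'s, we can still apply the Chernoff bound (see Lemma 55). By setting $(1+\beta)E[X] = \frac{f}{4}$ in Lemma 51 (from (4.22), $E[X] \le \exp\left(-\frac{3E[X_i]}{25}\right) \le \exp\left(-\frac{3}{25}(180f)\right) < \frac{f}{4}$, yielding $\beta > 0$), we obtain, by appropriate substitutions at each step, the following:

$$\begin{aligned}
\Pr\left[X \ge \left\lceil\frac{f}{4}\right\rceil\right] \le \Pr\left[X \ge \frac{f}{4}\right] &\le \left(\frac{e^{\beta}}{(1+\beta)^{(1+\beta)}}\right)^{E[X]} \le \left(\frac{e}{1+\beta}\right)^{(1+\beta)E[X]} = \left(\frac{4eE[X]}{f}\right)^{\frac{f}{4}} \\
&\le \left(\frac{4e\exp\left(-\frac{3}{25}\cdot\frac{90\max\{\log n, c\}}{f}\right)}{f}\right)^{\frac{f}{4}} \quad \text{from (4.18) and (4.22)} \\
&= \left(\frac{4e}{f}\right)^{\frac{f}{4}}\exp\left(-\frac{270\max\{\log n, c\}}{100}\right) = \left(\frac{4e}{f}\right)^{\frac{f}{4}}\exp\left(-2.7\max\{\log n, c\}\right) \\
&\le (2e)^{\frac{f}{4}}\exp\left(-2.7\max\{\log n, c\}\right) \quad (\text{since } f \ge 2) \\
&\le \exp\left(-2.7\max\{\log n, c\}\right)\exp\left(\frac{f}{2}\right) \qquad (4.23) \\
&\le \exp\left(-2\max\{\log n, c\}\right) \le \frac{1}{n^2} \quad (\text{since } f \le c)
\end{aligned}$$

Footnote: From (4.20): $\frac{E[X_i]}{200} \ge \log c$ whenever $c \ge 2$.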


Applying the union bound over all (fewer than $n$) cells in the network, the probability that this happens in any cell is at most $\frac{1}{n}$. Thus, with probability at least $1 - \frac{1}{n}$, $X < \left\lceil\frac{f}{4}\right\rceil$, i.e., $X \le \left\lfloor\frac{f}{4}\right\rfloor$ (since $X$ is an integer), in every cell. Hence, from the definition of $X$, each cell has at least $c - \left\lfloor\frac{f}{4}\right\rfloor$ channels with $X_i \ge \frac{E[X_i]}{2}$ candidate nodes capable of switching on them, and therefore with $X_i \ge \left\lceil\frac{E[X_i]}{2}\right\rceil$ such candidate nodes (since $X_i$ is also an integer). From (4.17) and the definition of $M_u$, we know that $M_u = \left\lceil\frac{E[X_i]}{2}\right\rceil$. This proves the result. □


Similar to the proof of Theorem 5, the approach involves constructing a routing structure (backbone) for each node. However, in this case, we only need to construct routing structures that can provide a route between the $n$ chosen SD pairs, and not all node pairs. Thus, the constructed backbones are partial backbones in that, unlike the proof of Theorem 5, they do not cover all cells in the network. Moreover, since our concern is not merely connectivity but also capacity, these partial backbones need to be constructed carefully, to ensure that no bottlenecks are formed.

Similar to the proof of Theorem 5, we begin by classifying all nodes as either backbone candidates or transition facilitators.

Conditioning on Lemma 27, there are at least $\frac{200\max\{\log n, c\}}{p_{rnd}}$ nodes in each cell w.h.p. Initially, in each cell, we choose $\frac{180\max\{\log n, c\}}{p_{rnd}}$ nodes uniformly at random as backbone candidates. The remaining nodes (which are at least $\frac{20\max\{\log n, c\}}{p_{rnd}}$ in number) are deemed transition facilitators.


Footnote: The number of nodes in either category must be an integer. Here, for simplicity we assume that we can indeed select exactly $\frac{180\max\{\log n, c\}}{p_{rnd}}$ backbone candidates and the remaining nodes are at least $\frac{20\max\{\log n, c\}}{p_{rnd}}$. If these two quantities are not integers, but one can select at least $\left\lceil\frac{180\max\{\log n, c\}}{p_{rnd}}\right\rceil$ backbone candidates and still have at least $\left\lceil\frac{20\max\{\log n, c\}}{p_{rnd}}\right\rceil$ nodes left as transition facilitators, the results will evidently continue to hold. It is also possible to conceive of a scenario where there are exactly $\frac{200\max\{\log n, c\}}{p_{rnd}}$ nodes in the cell, but $\frac{180\max\{\log n, c\}}{p_{rnd}}$ and $\frac{20\max\{\log n, c\}}{p_{rnd}}$ are not integers. In such a scenario, one can select $\left\lceil\frac{180\max\{\log n, c\}}{p_{rnd}}\right\rceil$ nodes as backbone candidates and $\left\lfloor\frac{20\max\{\log n, c\}}{p_{rnd}}\right\rfloor$ transition facilitators, without affecting the results (except for a minor change in the probability calculations involving transition facilitators).

We next define a notion of a channel being proper in a cell:

Definition 3. (Proper Channel) A channel $i$ is deemed proper in cell $\mathcal{H}$ if at least $M_u$ backbone candidate nodes in $\mathcal{H}$ are capable of switching (operating) on it.

Note that being proper is a property defined with respect to a specific cell, i.e., a channel can be proper in one cell and not proper in another.
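The definition can be made concrete with a short sketch that draws a random $(c, f)$ assignment for the backbone candidates of one cell and reports which channels meet a given threshold. The function names, the seed, and the toy parameter values are assumptions for illustration only; in the construction the threshold is $M_u$ and the candidate count is as chosen above.

```python
import random
from collections import Counter


def random_assignment(num_nodes, c, f, rng):
    """Give each node a uniformly random subset of f channels out of 1..c."""
    return [frozenset(rng.sample(range(1, c + 1), f)) for _ in range(num_nodes)]


def proper_channels(candidate_channel_sets, threshold):
    """Channels on which at least `threshold` backbone candidates can operate."""
    counts = Counter(ch for channels in candidate_channel_sets for ch in channels)
    return {ch for ch, k in counts.items() if k >= threshold}


# Hand-made example: only channel 2 is held by at least two candidates.
toy = [{1, 2}, {2, 3}, {2}]
toy_proper = proper_channels(toy, threshold=2)

# Randomised example with hypothetical parameters (not the thesis's values).
rng = random.Random(0)
cell = random_assignment(num_nodes=50, c=10, f=3, rng=rng)
cell_proper = proper_channels(cell, threshold=5)
```

Running `proper_channels` separately on each cell's candidates shows why properness is a per-cell property: the same channel can clear the threshold in one cell and miss it in another.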

Lemma 32. For each cell of the network, the following is true w.h.p.: if the number of proper channels in the cell is $c'$, then $c' \ge c - \left\lfloor\frac{f}{4}\right\rfloor \ge c - \left\lfloor\frac{c}{4}\right\rfloor \ge \left\lceil\frac{3c}{4}\right\rceil \ge \frac{3c}{4}$.

Proof. The proof follows from Lemma 27 and Lemma 31. □

We now prove a property that plays an important role in proving that traffic load can be distributed without creating bottlenecks:

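Lemma 33. Consider a cell $\mathcal{H}$. Let $W_i$ be the set of all nodes in the 8 adjacent cells $\mathcal{H}(k), 1 \le k \le 8$, that are capable of switching on channel $i$.

For a set of nodes $B$, define $C_{\mathcal{H}}(B)$ as:

$$C_{\mathcal{H}}(B) = \{j \mid j \text{ proper in } \mathcal{H} \text{ and } \exists u \in B \text{ capable of switching on } j\}$$

If $f \ge 100$, the following holds w.h.p.:

$$\forall \mathcal{H},\ \forall \text{ channels } i,\ \forall B \subseteq W_i \text{ such that } |B| = \left\lceil\frac{fna(n)}{4c}\right\rceil:\quad |C_{\mathcal{H}}(B)| \ge \left\lceil\frac{3c}{8}\right\rceil$$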

Proof. We condition on the node-locations, and their conforming to the high-probability event of Lemma 27. Consider a cell $\mathcal{H}$. Let $c'$ be the number of proper channels in $\mathcal{H}$.

Having conditioned on (and thus fixed) the node-locations (and thereby node-population in each cell), channel-presence in each cell is independent of other cells, as channel assignment is done independently for each node.

Then we can show that $c' \ge c - \left\lfloor\frac{f}{4}\right\rfloor \ge c - \left\lfloor\frac{c}{4}\right\rfloor \ge \frac{3c}{4}$ with probability at least $1 - \frac{1}{n^2}$, by following the proof argument of Lemma 31 up to (4.23) (just prior to application of the union bound over all cells in the proof of that lemma).

Footnote (to Lemma 33): This can be viewed as a special variant of the Coupon Collector's problem [83], where there are $c$ different types of coupons, and each box has a random subset of $f$ different coupons. Some other somewhat different variants having multiple coupons per box have been considered in work on coding, e.g., [33].

If $c' < \frac{3c}{4}$, then we assume that our desired event does not happen for the purpose of obtaining a bound. This probability is at most $\frac{1}{n^2}$. We now focus on the case where $c' \ge \frac{3c}{4}$.

Consider a particular channel $i$.

Recall that $W_i$ is the set of nodes in the cells adjacent to $\mathcal{H}$ that are capable of switching on channel $i$.

We first bound the probability that $|W_i| > 2400e^2\max\{\log n, c\}$.

Let $Y_{ij}$ be an indicator variable that is 1 if node $j$ in cells adjacent to $\mathcal{H}$ is capable of switching on channel $i$, and 0 else. Then we know that $\Pr[Y_{ij} = 1] = \frac{f}{c}$, and for a given $i$, the $Y_{ij}$'s are independent. Let $Y_i = \sum_{k=1}^{8}\sum_{j \in \mathcal{H}(k)} Y_{ij}$ (recall that $\mathcal{H}(k), 1 \le k \le 8$, are the cells adjacent to $\mathcal{H}$), so that $Y_i = |W_i|$. Since the node-locations, and hence cell-populations, conform to the high probability event of Lemma 27, $E[Y_i] \le 8\cdot\frac{f}{c}\cdot\frac{6na(n)}{5} = \frac{8f}{c}\cdot\frac{300\max\{\log n, c\}}{p_{rnd}} = \frac{2400f\max\{\log n, c\}}{c\, p_{rnd}}$. Setting $(1+\beta)E[Y_i] = 2400e^2\max\{\log n, c\}$, observing from (4.13) that $\beta \ge e^2 - 1 > 0$, and applying the Chernoff bound from Lemma 51 in the same manner as in (4.12):

$$\Pr\left[|W_i| > 2400e^2\max\{\log n, c\}\right] \le \exp\left(-2400e^2\max\{\log n, c\}\right) \qquad (4.24)$$

Denote by $\mathcal{E}_{i,\mathcal{H}}$ the event that, for given $i$ and $\mathcal{H}$: $\exists B \subseteq W_i$ such that $|B| = \left\lceil\frac{fna(n)}{4c}\right\rceil$ and $|C_{\mathcal{H}}(B)| < \left\lceil\frac{3c}{8}\right\rceil$. Let $p_{ub}(x)$ be an upper-bound on $\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, |W_i| = x, c' \ge \frac{3c}{4}\right]$. Note that, having conditioned on (and hence fixed) the node-locations, $|W_i|$ is independent of whether $c' \ge \frac{3c}{4}$ or not.
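If $p_{ub}(x)$ is a non-decreasing function of $x$, then the following holds:

$$\begin{aligned}
\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, c' \ge \tfrac{3c}{4}\right] &= \Pr\left[|W_i| \le b \,\middle|\, c' \ge \tfrac{3c}{4}\right]\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, |W_i| \le b, c' \ge \tfrac{3c}{4}\right] \\
&\quad + \Pr\left[|W_i| > b \,\middle|\, c' \ge \tfrac{3c}{4}\right]\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, |W_i| > b, c' \ge \tfrac{3c}{4}\right] \\
&\le \Pr[|W_i| \le b]\,\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, |W_i| \le b, c' \ge \tfrac{3c}{4}\right] + \Pr[|W_i| > b] \\
&= \sum_{x \le b}\Pr[|W_i| = x]\,\Pr\left[\mathcal{E}_{i,\mathcal{H}} \,\middle|\, |W_i| = x, c' \ge \tfrac{3c}{4}\right] + \Pr[|W_i| > b] \\
&\le \sum_{x \le b}\Pr[|W_i| = x]\,p_{ub}(x) + \Pr[|W_i| > b] \\
&\le \sum_{x \le b}\Pr[|W_i| = x]\,p_{ub}(b) + \Pr[|W_i| > b] \\
&= p_{ub}(b)\sum_{x \le b}\Pr[|W_i| = x] + \Pr[|W_i| > b] \\
&= p_{ub}(b)\Pr[|W_i| \le b] + \Pr[|W_i| > b] \\
&\le p_{ub}(b) + \Pr[|W_i| > b]
\end{aligned} \qquad (4.25)$$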


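We now find such an upper-bound $p_{ub}(x)$ that is a non-decreasing function of $x$.

Note that we only need to explicitly consider $x \ge \left\lceil\frac{fna(n)}{4c}\right\rceil$. If $x < \left\lceil\frac{fna(n)}{4c}\right\rceil$, there exist no subsets $B \subseteq W_i$ satisfying $|B| = \left\lceil\frac{fna(n)}{4c}\right\rceil$; thus the event $\mathcal{E}_{i,\mathcal{H}}$ cannot occur, and trivially $p_{ub}(x) = 0$ for $0 \le x < \left\lceil\frac{fna(n)}{4c}\right\rceil$.

If $|W_i| = x \ge \left\lceil\frac{fna(n)}{4c}\right\rceil$, then from Lemma 62, the number of subsets of $W_i$ of cardinality $m = \left\lceil\frac{fna(n)}{4c}\right\rceil$ satisfies $\binom{x}{m} \le \left(\frac{xe}{m}\right)^m$.

Consider a subset $B \subseteq W_i$ of specified cardinality $m = \left\lceil\frac{fna(n)}{4c}\right\rceil$. Denote by $X_j$ the indicator variable which is 1 if channel $j$ is not a member of $C_{\mathcal{H}}(B)$ and 0 else.

Recall that each node in $B$ has one channel known to be $i$, but the remaining $f - 1$ channels assigned to it are an i.i.d. chosen subset from the remaining $c - 1$ available channels. Thus:

$$\Pr[x \in W_j\ (j \ne i) \mid x \in W_i] = \frac{f-1}{c-1} \ge \frac{f-1}{c} \ge \frac{99f}{100c} \quad (\because f \ge 100) \qquad (4.26)$$

From (4.26), $\Pr[X_j = 1] \le \left(1 - \frac{f-1}{c-1}\right)^{m} \le \left(1 - \frac{99f}{100c}\right)^{\left\lceil\frac{fna(n)}{4c}\right\rceil} \le e^{-\frac{99f}{100c}\left\lceil\frac{fna(n)}{4c}\right\rceil}$ (applying Fact 2). Furthermore, for a given $B$, the $X_j$'s are negatively correlated.

Let $X = \sum_{j \text{ proper in } \mathcal{H}} X_j$. Then $E[X] \le c' e^{-\frac{99f}{100c}\left\lceil\frac{fna(n)}{4c}\right\rceil}$. Setting $(1+\beta)E[X] = \frac{c'}{2}$, one can see that $\beta = \frac{c'}{2E[X]} - 1 \ge \frac{1}{2}e^{\frac{99f}{100c}\left\lceil\frac{fna(n)}{4c}\right\rceil} - 1 > 0$ (recall that $na(n) = \frac{250\max\{\log n, c\}}{p_{rnd}} \ge \frac{250c\max\{\log n, c\}}{2f^2} \ge \frac{125c^2}{f^2}$ from Lemma 14). Hence, we can apply the Chernoff bound from Lemma 51 to obtain that (writing $m = \left\lceil\frac{fna(n)}{4c}\right\rceil$):

$$\begin{aligned}
\Pr\left[X \ge \frac{c'}{2}\right] &\le \left(\frac{e^{\beta}}{(1+\beta)^{(1+\beta)}}\right)^{E[X]} \le \left(\frac{e}{1+\beta}\right)^{(1+\beta)E[X]} = \left(\frac{2eE[X]}{c'}\right)^{\frac{c'}{2}} \le \left(\frac{2ec'\exp\left(-\frac{99f}{100c}m\right)}{c'}\right)^{\frac{c'}{2}} \\
&= \exp\left(\frac{c'}{2}\left(-\frac{99f}{100c}m + (1 + \ln 2)\right)\right) \\
&\le \exp\left(\frac{3c}{8}\left(-\frac{99f}{100c}m + (1 + \ln 2)\right)\right) \quad \left(\because -\frac{99f}{100c}m + (1 + \ln 2) < 0 \text{ and } c' \ge \frac{3c}{4}\right) \\
&= \exp\left(-\frac{297f}{800}m + \frac{3c(1 + \ln 2)}{8}\right) \\
&\le \exp\left(-\frac{297f}{800}m + \frac{4f}{125}m\right) \quad \left(\because \frac{4f}{125}m \ge \frac{f^2na(n)}{125c} = \frac{2f^2\max\{\log n, c\}}{c\, p_{rnd}} \ge \max\{\log n, c\} \ge \frac{3c(1 + \ln 2)}{8}\right) \\
&\le \exp\left(-\frac{265f}{800}\left\lceil\frac{fna(n)}{4c}\right\rceil\right)
\end{aligned} \qquad (4.27)$$

Due to integrality of $X$, $X < \frac{c'}{2} \implies X \le \left\lfloor\frac{c'}{2}\right\rfloor \implies |C_{\mathcal{H}}(B)| \ge \left\lceil\frac{c'}{2}\right\rceil \ge \left\lceil\frac{3c}{8}\right\rceil$.

Taking the union bound over all possible subsets $B$, we obtain that the probability that $|C_{\mathcal{H}}(B)| < \left\lceil\frac{3c}{8}\right\rceil$ for any such subset $B$ is at most $\left(\frac{xe}{m}\right)^m\exp\left(-\frac{265f}{800}\left\lceil\frac{fna(n)}{4c}\right\rceil\right)$, which is an increasing function of $x$ (recall that $m = \left\lceil\frac{fna(n)}{4c}\right\rceil$). Thus we obtain: $p_{ub}(x) = \left(\frac{xe}{m}\right)^m\exp\left(-\frac{265f}{800}\left\lceil\frac{fna(n)}{4c}\right\rceil\right)$ for $x \ge \left\lceil\frac{fna(n)}{4c}\right\rceil$. Resultantly, $p_{ub}(x)$ is a non-decreasing function of $x$.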
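For $b = 2400e^2\max\{\log n, c\}$ (writing $m = \left\lceil\frac{fna(n)}{4c}\right\rceil$):

$$\begin{aligned}
p_{ub}(b) &= p_{ub}\left(2400e^2\max\{\log n, c\}\right) = \left(\frac{2400e^3\max\{\log n, c\}}{\left\lceil\frac{fna(n)}{4c}\right\rceil}\right)^{m}\exp\left(-\frac{265f}{800}m\right) \\
&\le \left(\frac{2400e^3\max\{\log n, c\}}{\frac{fna(n)}{4c}}\right)^{m}\exp\left(-\frac{265f}{800}m\right) \\
&= \left(\frac{9600e^3c\,p_{rnd}}{250f}\right)^{m}\exp\left(-\frac{265f}{800}m\right) \\
&= \exp\left(\left(3 + \log\frac{9600}{250} + \log\frac{c\,p_{rnd}}{f}\right)m\right)\exp\left(-\frac{265f}{800}m\right) \\
&\le \exp\left(\left(3 + \log 40 + \log 2f\right)m\right)\exp\left(-\frac{265f}{800}m\right) \quad (\text{using Lemma 14})
\end{aligned} \qquad (4.28)$$

Note that:

$$\forall f \ge 100:\quad f \ge 8(3 + \log 40 + \log 2f) \qquad (4.29)$$

Therefore:

$$\begin{aligned}
p_{ub}(b) &\le \exp\left(\frac{f}{8}\left\lceil\frac{fna(n)}{4c}\right\rceil\right)\exp\left(-\frac{265f}{800}\left\lceil\frac{fna(n)}{4c}\right\rceil\right) \\
&= \exp\left(-\frac{165f}{800}\left\lceil\frac{fna(n)}{4c}\right\rceil\right) \\
&\le \exp\left(-\frac{f^2na(n)}{20c}\right) \\
&\le \exp\left(-\frac{125\log n}{20}\right) < \frac{1}{n^6} \quad (\text{from Lemma 14 and our choice of } a(n))
\end{aligned} \qquad (4.30)$$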
(  from  Lemma  14  and  our  choice  of  a(n)) 


From  (4.24),  (4.25),  and  (4.30):  Pr[Tg,„|c'  >  f  ]  <  p,gb(6)  +  Pr[| Wg|  >  6]  <  ^  < 

1 

* 

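Since there are $c = O(\log n)$ channels $i$ to consider, we take a union bound over them to obtain that:

$$\Pr\left[\mathcal{E}_{i,\mathcal{H}} \text{ for any } i \text{ in } \mathcal{H} \,\middle|\, c' \ge \frac{3c}{4}\right] \le c\,\Pr\left[\mathcal{E}_{i,\mathcal{H}} \text{ for a given } i \text{ in } \mathcal{H} \,\middle|\, c' \ge \frac{3c}{4}\right]$$

Thus: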


Y’i[£i;yi  for  any  i  in  ]  <  Pr[c'  <  ^]  +  -Prfc'  >  —  ](cPr[<£’j^'>^  for  a  given  i  in  T-i\c  >  — ]) 

3c  3c  1  c 

<  Pr[c'  <  — ]  +  cPriTj  for  a  given  i  in  |c'  >  — ]  <  ^  H - f 

4  ’  4  n® 

We  take  another  union  bound  over  all  -tC  =  ^  cells  7{  to  obtain  that 

Oil  77- 1  lilg-X'l  lO^  77-^Cj  C- 

the  probability  this  occurs  in  any  cell  is  at  most  ^ 

Finally,  recall  that  we  conditioned  our  proof  on  the  node- locations  conforming  to  the 
high-probability  event  of  Lemma  27.  The  probability  that  this  event  does  not  occur  is 
at  most  (as  proved  in  Lemma  27),  and  we  can  obtain  a  bound  by  assuming  that 

whenever  that  event  fails  to  hold,  the  event  in  the  statement  of  this  lemma  fails  to  hold. 
This  completes  the  proof  that  C{B)  >  c'  —  [|-J  >  \^'\  >  for  all  specified  subsets  B  of 

interest,  for  all  channels  i,  and  in  all  cells  77  with  probability  at  least  1  ~  ^  ~  ^  ~  > 

2  _  g  _  50  log  n  Q 

n  n 

4.7.1  Routing  and  Channel  Assignment 

There  are  two  inter-related  aspects  of  the  routing  procedure:  determining  the  sequence 
of  cells  a  route  should  traverse,  and  finding  a  feasible  sequence  of  nodes/links  along  that 
sequence  of  cells  which  provides  an  end-to-end  route  from  source  to  destination,  while 
avoiding  bottleneck  formation. 

We  begin  by  addressing  the  issue  of  finding  a  feasible  sequence  of  nodes/links  that 
can  provide  an  end-to-end  route  from  source  to  destination,  given  a  sequence  of  cells  to 
traverse.  We  introduce  routing  structures  that  can  facilitate  this.  We  then  show  that  if 
the  number  of  cells  traversed  is  at  least  a  certain  minimum  number,  then  an  end-to-end 
feasible  route  can  be  found,  and  describe  a  method  of  choosing  the  cell-sequence  for  each 
route.  Thereafter  we  address  the  issue  of  constructing  the  routing  structures  in  a  manner 
that  ensures  load-balance. 

Partial Backbones The routing strategy is based on constructing source and destination
routing structures, in a manner similar to the backbones used to prove the sufficient
condition for connectivity. However, instead of constructing a full backbone for each node
covering each cell of the network, only a partial backbone is constructed for each node x.
The partial backbone of a node x is denoted by $B_p(x)$.

$B_p(x)$ comprises a source segment $\mathcal{S}_b(x)$ for the flow for which x is the source. It also
comprises a collection $\mathcal{D}_b(x)$ of destination segments $\mathcal{D}_b^{(i)}(x)$, one for each flow i for which x
is the destination. $\mathcal{S}_b(x)$ expands outwards from x to cover the sequence of cells on the
route from x to its destination in that very order. Thus, there is a path comprising nodes
and links in $\mathcal{S}_b(x)$ from x to any node $q_x \in \mathcal{S}_b(x)$ that follows the exact sequence of cells
traversed by the route of x's flow, up to $q_x$'s cell. Each $\mathcal{D}_b^{(i)}(x)$ expands outwards from x to
cover the cells on the route (in reverse order) from the source of flow i to x. Thus, there is a
path from x to any node $q_x \in \mathcal{D}_b^{(i)}(x)$ that follows the reverse sequence of cells traversed by
the route of flow i up to $q_x$'s cell (correspondingly, the path from $q_x \in \mathcal{D}_b^{(i)}(x)$ to x follows
the sequence of cells traversed by flow i's route along that stretch).

Note that each segment is a collection of nodes (V) and links/edges (E) between some
of these nodes. Thus $\mathcal{S}_b(x) = (V(\mathcal{S}_b(x)), E(\mathcal{S}_b(x)))$, and $\mathcal{D}_b^{(i)}(x) = (V(\mathcal{D}_b^{(i)}(x)), E(\mathcal{D}_b^{(i)}(x)))$.
Since we are concerned with load-balance, each link also has an assigned channel of operation
(from amongst all feasible channels for that link).
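As an illustration only, a partial backbone can be held in a small data structure: one source segment, plus one destination segment per flow that terminates at the node, with every link carrying its assigned channel. The class and field names below (`Segment`, `PartialBackbone`, `add_link`) are assumptions made for this sketch and do not come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One backbone segment: a node set V, a link set E, and a channel per link."""
    nodes: set = field(default_factory=set)
    links: dict = field(default_factory=dict)  # (u, v) -> assigned channel

    def add_link(self, u, v, channel, feasible_channels):
        # The assigned channel must be one of the feasible channels for this link.
        if channel not in feasible_channels:
            raise ValueError("channel not feasible for this link")
        self.nodes.update((u, v))
        self.links[(u, v)] = channel


@dataclass
class PartialBackbone:
    """Partial backbone of a node: a source segment and per-flow destination segments."""
    owner: str
    source: Segment = field(default_factory=Segment)
    destinations: dict = field(default_factory=dict)  # flow id -> Segment

    def __post_init__(self):
        # The owner is a default member of its own source segment.
        self.source.nodes.add(self.owner)

    def destination_segment(self, flow_id):
        # Create the segment on first use; the owner belongs to it by default.
        seg = self.destinations.setdefault(flow_id, Segment())
        seg.nodes.add(self.owner)
        return seg
```

Keeping the channel on the link (rather than on the node) mirrors the statement that each link has its own assigned channel of operation.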

Also note that some of the segments above may traverse common cells. In particular, x's
cell is common to all segments, and x is a default member of its own backbone and of all its
backbone segments. If two or more segments have a common cell other than x's cell, it is acceptable for
each  segment  to  have  a  different  backbone  node  in  that  cell  (and  correspondingly  different 
incoming/outgoing  backbone  links),  if  needed.  Nodes/links  may  also  be  common  to  the 
segments  if  it  is  feasible  while  ensuring  that  each  segment  traverses  the  stipulated  sequence 
of  cells. 

The initial part of the route of a flow i with source x and destination y is along the links
of the source backbone segment $\mathcal{S}_b(x)$. As it approaches the destination, it then attempts
to find a transition point and move onto the destination backbone segment $\mathcal{D}_b^{(i)}(y)$ (Fig.
4.2).

In light of the preceding lemmas, it is easy to see that it is indeed always feasible to
construct each segment of $B_p(x)$ for all nodes x: Consider a node in some cell of the network
which is the current terminus of the backbone-segment under construction. It needs to find
a node in the next cell to be filled such that it can communicate with that node. The node
can switch on f channels. From Lemma 32, at least $f - \lfloor f/4 \rfloor$ of these f channels are proper
in the next cell, and therefore there are nodes in that cell capable of switching
on each of these channels w.h.p. In light of this it is always possible to expand the segment
further.

Figure 4.2: Illustration of routing along backbones

However,  our  goal  is  more  than  just  connectivity,  and  the  backbone  segments  must 
be  constructed  in  a  manner  that  avoids  bottleneck  formation.  We  will  later  describe  a 
backbone  construction  procedure  that  ensures  load-balance.  First  we  prove  that,  given 
any  set  of  feasible  backbones,  it  is  possible  to  find  an  end-to-end  feasible  route  along  the 
backbone  segments  from  the  flow’s  source  to  its  destination. 

Lemma 34. Suppose a flow i has source x and destination y. As described previously,
the flow's packets are initially sent on segment $\mathcal{S}_b(x)$ of $B_p(x)$ and eventually need to
transition onto segment $\mathcal{D}_b^{(i)}(y)$ of $B_p(y)$ (to reach y). After having traversed $\lceil 4/p_{rnd} \rceil$ distinct
intermediate cells (hops) while seeking a transition opportunity, the flow will have found
an opportunity to make this transition w.h.p. If the routes of each of the n flows get to
traverse at least $\lceil 4/p_{rnd} \rceil$ distinct intermediate cells (note that each individual flow's route
needs to traverse at least so many distinct cells; two different flows may share cells on their
respective routes), then all n flows are able to transition w.h.p. The cells must be chosen
in a manner independent of channel presence in the cells.
Proof. Consider a flow traversing a sequence of cells $\mathcal{H}_1, \mathcal{H}_2, \ldots$ If the representative of $\mathcal{S}_b(x)$
(let us call it $q_x$) in $\mathcal{H}_j$ can communicate (directly or indirectly) with the representative
of $\mathcal{D}_b^{(i)}(y)$ (let us call it $q_y$) in $\mathcal{H}_j$, it is possible to transition from $\mathcal{S}_b(x)$ to $\mathcal{D}_b^{(i)}(y)$. If $q_x$
and $q_y$ can operate on some common channel, this is trivially possible. If $q_x$ and $q_y$ do not
operate on a common channel, we consider the probability that the two can communicate
via a third node from amongst the transition facilitators in $\mathcal{H}_j$, i.e., there exists a transition
facilitator z such that z shares at least one channel with $q_x$ and one channel with $q_y$. In
Section 4.4.2, we showed that if $q_x$ and $q_y$ are incapable of direct communication, then
they can communicate through a given z with probability $p_z \ge \frac{p_{rnd}}{40}$. Given our choice
of cell area a(n), and conditioned on the lower bound on the number of nodes in each cell
(Lemma 27), of which a number proportional to max{log n, c} are deemed backbone candidates
and the rest are transition facilitators, there are at least 20 log n possibilities for z within that
cell (since these cells are intermediate cells, i.e., do not include the cells in which x and y lie
respectively, $q_x$ and $q_y$ themselves must be backbone candidates). All the possible z nodes
have i.i.d. channel assignments. Thus, the probability that $q_x$ and $q_y$ cannot communicate
through any z in the cell is at most $(1 - p_z)^{20 \log n}$, and the probability they communicate
through some z is $p_{xy} \ge 1 - (1 - p_z)^{20 \log n}$.

Hence, the probability that this happens in none of the $\lceil 4/p_{rnd} \rceil$ distinct intermediate cells
is at most

$$(1 - p_{xy})^{\lceil 4/p_{rnd} \rceil} \le (1 - p_z)^{\frac{80 \log n}{p_{rnd}}} \le \left(1 - \frac{p_{rnd}}{40}\right)^{\frac{80 \log n}{p_{rnd}}} \le e^{-2 \log n} \le \frac{1}{n^2}$$

(applying Fact 2). Applying the union bound over all n flows, the probability that all flows are able
to transition is at least $1 - \frac{1}{n}$. □
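The shape of this argument (independent facilitators within a cell, then independent cells along the route) can be checked numerically. The sketch below is illustrative only: the function names and the sample values of `p_z`, the number of facilitators per cell, and the target failure probability are assumptions, not quantities taken from the text.

```python
import math


def transition_failure_bound(p_z, facilitators_per_cell, cells):
    """Upper bound on the probability that no transition is found in any cell.

    p_z: lower bound on the probability that one given facilitator works.
    facilitators_per_cell: number of independent facilitator candidates per cell.
    cells: number of distinct intermediate cells inspected.
    """
    per_cell_failure = (1.0 - p_z) ** facilitators_per_cell
    return per_cell_failure ** cells


def cells_needed(p_z, facilitators_per_cell, target):
    """Smallest number of cells for which the failure bound drops to `target`."""
    per_cell_failure = (1.0 - p_z) ** facilitators_per_cell
    return math.ceil(math.log(target) / math.log(per_cell_failure))


# Example with made-up numbers: each facilitator works with probability >= 0.01,
# 20 facilitators per cell, and we want failure probability at most 1e-6.
k = cells_needed(0.01, 20, 1e-6)
```

A union bound over n flows then multiplies the per-flow failure bound by n, which is why the per-flow target has to be of order 1/n^2 to leave an overall failure probability of order 1/n.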

Therefore, we would like each route to traverse at least $\lceil 4/p_{rnd} \rceil$ distinct intermediate cells
(hops) to be able to find a transition point from the source backbone to the destination
backbone.

If the straight-line SD'D path for a flow (Fig. 4.3) comprises $h \ge \lceil 4/p_{rnd} \rceil$ distinct
intermediate cells, it suffices to use this route. If S and D' (hence also D) lie close to each
other, the hop-length of the straight-line cell-to-cell path can be much smaller. In this case,
a detour path SPD'D is chosen (Fig. 4.4) in a manner similar to the previously described
constructions, by choosing a point P on the circumference of a circle of radius $\frac{4}{p_{rnd}} r(n)$
centered at S. Since $r(n) = \sqrt{8a(n)}$, it is easy to see that the SP segment will traverse at
least $\lceil 4/p_{rnd} \rceil$ distinct intermediate cells.
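The route-selection rule can be sketched as follows, assuming a square tessellation with cells of side `side`. Everything named here (`cells_on_segment`, `plan_route`, `min_cells`, `detour_radius`, `angle`) is an assumption of the sketch; in particular, the text does not say how the point P is picked on the circle, and the sketch does not handle a P that falls outside the network region.

```python
import math


def cells_on_segment(a, b, side):
    """Grid cells (col, row) met by the segment a -> b, in order of traversal."""
    (x0, y0), (x1, y1) = a, b
    ts = {0.0, 1.0}
    # Collect the parameters t at which the segment crosses a grid line.
    for p0, p1 in ((x0, x1), (y0, y1)):
        if p0 != p1:
            lo, hi = sorted((p0, p1))
            k = math.floor(lo / side) + 1
            while k * side < hi:
                ts.add((k * side - p0) / (p1 - p0))
                k += 1
    ts = sorted(ts)
    cells = []
    # The midpoint of each sub-interval lies strictly inside one cell.
    for ta, tb in zip(ts, ts[1:]):
        tm = (ta + tb) / 2.0
        cell = (math.floor((x0 + tm * (x1 - x0)) / side),
                math.floor((y0 + tm * (y1 - y0)) / side))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def plan_route(src, pseudo_dst, side, min_cells, detour_radius, angle):
    """Return waypoints [S, D'] or, if the straight line is too short, [S, P, D']."""
    direct = cells_on_segment(src, pseudo_dst, side)
    intermediate = len(direct) - 2  # exclude the cells containing S and D'
    if intermediate >= min_cells:
        return [src, pseudo_dst]
    # P lies on the circle of radius detour_radius centered at S.
    p = (src[0] + detour_radius * math.cos(angle),
         src[1] + detour_radius * math.sin(angle))
    return [src, p, pseudo_dst]
```

For example, with unit cells and `min_cells = 4`, a source at (0.5, 0.5) and pseudo-destination at (2.5, 0.5) sees only one intermediate cell, so the sketch returns a detour through P.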
Figure 4.3: Routing along a straight line

Figure 4.4: Illustration of detour routing

The  need  to  perform  detour  routing  for  some  source-destination  pairs  does  not  have  any 
substantial  effect  on  the  relaying  load  on  a  cell. 

Lemma  35.  If  the  number  of  flow-routes  traversing  in  any  cell  is  x  when  all  flows  use 
straight-line  routing,  it  is  at  most  x  x  +  O(log^n)  w.h.p.,  when  detour 

^  rnd 

routing  is  used  for  some  of  the  flows  as  previously  described. 

Proof. Recall that c = O(log n). Since the detour occurs only up to a circle of radius
$\frac{4}{p_{rnd}} r(n)$, the extra flow-routes that may pass through a cell (compared to straight-line
routing) are only those whose sources lie within a distance $\frac{4}{p_{rnd}} r(n)$ from some point in
this cell. All such possible sources fall within a circle of radius $\left(1 + \frac{4}{p_{rnd}}\right) r(n)$, and hence
area $a_c(n) = \Theta\left(\frac{r^2(n)}{p_{rnd}^2}\right)$. Applying Lemma 60 to the set of n node locations (with a suitable
choice of a(n) > 1), with high probability, any circle of this radius will have $O(n a_c(n))$
nodes, and hence $O(n a_c(n))$ sources. Hence, the number of extra flows that traverse the
cell due to detour routing is $O(n a_c(n))$, and each detour-routed flow's route can traverse a
cell at most twice along the SPD' stretch. Note that the possible additional last hop for
each flow is already accounted for in x. Thus, the total number of flow-routes (counting
repeat  traversals  separately)  x  -\-  q("^^ W)_  Since  nr‘^{n)  =  0(^^),  and  pmd  >  I’ 
total  number  of  flow- routes  is  x  -I  0(  2^^^)  X -I- 0(log^  n)  w.h.p.  □ 

Flow Transition Strategy From Lemma 34, we know that if each flow is able to inspect
$\lceil 4/p_{rnd} \rceil$ distinct intermediate cells, a transition opportunity will be found by all flows w.h.p.
In light of this, we use a procedure in which there are two phases associated with the route
of a flow. A non-detour-routed flow is initially in a progress-on-source-backbone phase,
during which its packets are sent along the links of the source backbone till there are only
$\lceil 4/p_{rnd} \rceil$ distinct intermediate cells left to the destination. At this point, it enters a
ready-for-transition phase, and seeks a transition to the destination backbone along the remaining
hops.® Once it has been able to make the transition onto the destination backbone, it
proceeds towards the destination on that backbone along the remaining part of the route,
and is thus guaranteed to reach the destination.

A  detour-routed  flow  is  always  in  ready-for-transition  phase. 
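The two-phase rule can be sketched as a small routine that walks a route cell by cell and reports which backbone carries the flow in each cell. This is an illustration under stated assumptions: `threshold` stands for the number of intermediate cells reserved for seeking a transition, `can_transition` is a hypothetical predicate saying whether a transition is available in a given cell, and "cells left" is counted as the cells strictly between the current cell and the destination cell.

```python
def backbone_per_cell(route_cells, threshold, can_transition, detour=False):
    """For each cell on the route, report which backbone carries the flow there.

    route_cells: cells from the source's cell to the destination's cell, in order.
    threshold: number of intermediate cells reserved for seeking a transition.
    can_transition: callable(cell) -> bool (hypothetical transition test).
    detour: a detour-routed flow is ready for transition from the start.
    """
    n = len(route_cells)
    used = []
    on_destination = False
    for idx, cell in enumerate(route_cells):
        remaining = n - idx - 2  # intermediate cells strictly ahead of this one
        ready = detour or remaining <= threshold
        if not on_destination and ready and can_transition(cell):
            on_destination = True  # switch onto the destination backbone
        used.append("destination" if on_destination else "source")
    return used
```

If the last entry is still `"source"`, no transition opportunity was found on that route, which is the event the preceding lemma shows to be unlikely.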

Lemma  36.  The  number  of  flow-routes  traversing  any  cell  in  ready-for-transition  phase 
(counting  repeat  traversals  separately)  is  O(log^n)  w.h.p. 

Proof.  First  let  us  account  for  the  SD'  stretch  of  each  flow’s  route,  without  considering  the 
possible  additional  last  hop.  We  account  for  it  explicitly  later  in  this  proof. 

In  our  construction,  a  non-detour  routed  flow  enters  the  ready-for-transition  phase  only 
when  it  is  distinct  intermediate  hops  away  from  its  destination.  All  such  flows  must 

have  their  pseudo-destinations  within  a  circle  of  radius  centered  in  the  cell.  The 

number  of  pseudo-destinations  that  lie  within  a  circle  of  radius  from  the  cell  is 

Q^nr^(n)^  0(||logn)  W.h.p.,  (by  observing  that  pmd  >  and  using  suitable  choice 

of  a{n)  in  Lemma  60).  Also  c  =  O(logn).  Hence  there  are  O(log^n)  non-detour-routed 
flows  in  ready-for-transition  phase  traversing  the  cell  w.h.p. 

A  detour-routed  flow  is  always  in  ready-for-transition  phase.  By  Lemma  35,  there  are 
0(log^  n)  such  flows  traversing  any  cell.  Each  such  flow  can  only  traverse  a  cell  twice  along 
the  SD'  (more  precisely  SPD')  stretch.  This  yields  0(log^  n)  detour-routed  flows  (including 
repeat  traversals). 

The  cell  may  also  be  traversed  by  some  of  the  above  flows  (both  non-detour-routed  and 
detour-routed) on their additional last hop. From Lemma 29, the pseudo-destinations of
such  flows  must  lie  in  the  same  cell  or  one  of  the  8  adjacent  cells.  Applying  Lemma  59  to 
the  set  of  n  pseudo-destinations,  the  total  number  of  pseudo-destinations  lying  in  these  9 
cells  is  0{na{n))  w.h.p.  Thus,  the  number  of  flows  entering  the  cell  on  their  additional  last 
hop  is  0{na{n))  O(log^n)  w.h.p. 

®This also implies that it would suffice to construct each destination backbone segment $\mathcal{D}_b^{(i)}(x)$ for a node
x only up to this distance outwards from x.

Figure 4.5: Cell $\mathcal{H}$ and neighboring cells during backbone construction

Hence,  the  number  of  flow-routes  in  ready- for-transition  phase  in  any  cell  is  0(log^  n) 
w.h.p.  □ 

Backbone Construction We now describe the procedure for constructing the backbone
$B_p(x)$ of x.

Given  a  cell  7i,  the  8  cells  adjacent  to  cell  7i  are  denoted  as  1  <  j  <  8  (Fig.  4.5). 
$B_p(x)$ is constructed as follows:

x is by default a member of $B_p(x)$. As described earlier, $B_p(x)$ has a source-segment $\mathcal{S}_b(x)$
and a collection of destination segments $\mathcal{D}_b^{(i)}(x)$, one for each flow i for which x is a destination.

Recall that $\mathcal{S}_b(x)$ comprises the SD' route from x to its destination, and may also
have an additional last hop to D if needed. However, from Lemma 29, the only such
last-hop routes that may enter a cell correspond to pseudo-destinations in the 8 adjacent cells.
Applying Lemma 59 to the set of pseudo-destinations, there are only O(na(n)) such
pseudo-destinations, and hence only O(na(n)) such last-hop flows entering the cell. These can be
accounted for separately. Therefore, we first consider the construction of the SD' part of
$\mathcal{S}_b(x)$ for each node x.

Construction of $\mathcal{S}_b(x)$ Recall that we are only constructing the SD' part and not
considering the possible additional last hop at this stage.
This has two sub-stages. In the first sub-stage, we construct backbones for source nodes
whose flow does not require a detour. In the second sub-stage we construct backbones for
source nodes whose flow requires a detour.

Straight-line  backbones: 

For each source of a non-detour-routed flow, the SD' segment of the route comprises
the cells intersected by the straight line SD'. One can define an ordering on these cells that
reflects the order in which each cell is encountered when moving from S to D' along the
straight line. The backbone-segment $\mathcal{S}_b(x)$ is expanded into new cells in the same order.
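The ordering of cells along the straight line can be computed directly. The sketch below assumes a square tessellation with cells of side `side`, indexed by (column, row); the function name `cells_along_line` is illustrative. It finds every point where the segment crosses a grid line, sorts those crossings, and reads off the cell containing the midpoint of each piece.

```python
import math


def cells_along_line(s, d, side):
    """Cells intersected by the segment s -> d, in the order they are encountered."""
    (x0, y0), (x1, y1) = s, d
    crossings = {0.0, 1.0}
    for p0, p1 in ((x0, x1), (y0, y1)):
        if p0 == p1:
            continue  # no grid lines are crossed along this axis
        lo, hi = sorted((p0, p1))
        k = math.floor(lo / side) + 1
        while k * side < hi:
            # Parameter t in (0, 1) at which the segment meets grid line k.
            crossings.add((k * side - p0) / (p1 - p0))
            k += 1
    ordered = sorted(crossings)
    cells = []
    for ta, tb in zip(ordered, ordered[1:]):
        tm = (ta + tb) / 2.0
        cell = (math.floor((x0 + tm * (x1 - x0)) / side),
                math.floor((y0 + tm * (y1 - y0)) / side))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells
```

The returned list is the order in which a backbone segment built along this line would be expanded, starting from the source's own cell.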

This  step  proceeds  in  a  synchronized  hop-by-hop  manner  for  all  non-detour-routed  flows 
(each  of  which  has  a  unique  source  x). 

Any cell of $\mathcal{S}_b(x)$ in which there is already a node assigned to $\mathcal{S}_b(x)$ is called a filled
cell. Thus, initially x's cell is filled. We consider the cell in $\mathcal{S}_b(x)$ that is entered next by
the flow's straight-line route. We consider all nodes in that cell that can operate on one
or more common channels with x. This provides a number of alternative channels on which
the flow's backbone can enter that cell.

Let $h_{max}$ be the maximum hop-length of any non-detour-routed SD' route. Then,
$h_{max} = O\left(\frac{1}{\sqrt{a(n)}}\right)$ and the procedure has $h_{max}$ steps. In step k, for each source node x
whose flow has k or more hops, $\mathcal{S}_b(x)$ expands into the cell entered by x's flow on the k-th
hop.
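The synchronized hop-by-hop schedule can be sketched as a grouping step: for each step k, collect, per cell, the sources whose backbone enters that cell on its k-th hop. The function name `entries_per_step` and the input layout are assumptions of this sketch.

```python
from collections import defaultdict


def entries_per_step(routes):
    """Group backbone expansions by step and by the cell being entered.

    routes: dict source -> list of cells on its route, starting with the
            source's own cell (so a route with h hops has h + 1 entries).
    Returns {k: {cell: [sources whose backbone enters `cell` on hop k]}}.
    """
    h_max = max((len(cells) - 1 for cells in routes.values()), default=0)
    schedule = {}
    for k in range(1, h_max + 1):
        entering = defaultdict(list)
        for src, cells in routes.items():
            if len(cells) - 1 >= k:  # this flow has k or more hops
                entering[cells[k]].append(src)
        schedule[k] = dict(entering)
    return schedule
```

Each entry `schedule[k][cell]` is the set of backbones that a cell has to place during step k, which is the input to the per-cell allocation procedure.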

Each cell $\mathcal{H}$ performs the procedure we will now describe.

Lemma 37. If f ≥ 100, then it is possible to devise a backbone construction procedure, such
that, after step $h_{max}$ of the backbone construction procedure for the SD' part of $\mathcal{S}_b(x)$ (for
sources x whose flows are not detour-routed), each cell has $O\left(\frac{n\sqrt{a(n)}}{c}\right)$ incoming backbone
links on a single channel, and each node appears on $O\left(\frac{1}{\sqrt{a(n)}}\right)$ (source) backbones, w.h.p.

Proof.  We  describe  such  a  backbone  construction  procedure  and  prove  its  load-balance 
characteristics  by  induction. 

We  remark  at  the  outset  that  the  proof  is  conditioned  on  the  occurrence  of  the  high 
probability  events  in  Lemma  27,  Lemma  28,  Lemma  32,  and  Lemma  33. 

Recall that we are expanding backbones to cover cells in $\mathcal{S}_b(x)$.

At each step of the construction, we first have a channel-allocation phase, followed by a
node-allocation phase. We prove that after step k of the backbone construction procedure,
the following two invariants hold for all cells of the network:

•  Invariant 1: Each node is assigned at most 14 new incoming backbone links during
step k. Thus after step k, it appears in a total of O(14k) = O(k) backbones.

•  Invariant 2: No more than $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ backbone links enter the cell on a single
channel during step k. Thus, $O\left(\frac{k\,na(n)}{c}\right)$ incoming backbones (entering the cell) are
assigned (incoming links) on any single channel after step k.

If the above two Invariants hold, then it is easy to see that after $h_{max}$ steps, cell $\mathcal{H}$ will
have no more than $\left\lfloor \frac{5na(n)}{c} \right\rfloor h_{max} = O\left(\frac{na(n)}{c\sqrt{a(n)}}\right) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$ backbone links assigned to any single channel,
and no node occurs on more than $14 h_{max} = O\left(\frac{1}{\sqrt{a(n)}}\right)$ backbones (from (4.15)).

We prove by induction that the invariants hold, as follows:

If  Invariant  1  holds  at  the  end  of  step  k  —  1,  then  Invariant  2  continues  to 
hold  after  the  channel-allocation  phase  of  step  k.  If  Invariant  2  holds  after  the 
channel-allocation  phase  of  step  k,  then  Invariant  1  will  continue  to  hold  after 
the  node-allocation  phase  of  step  k,  and  thus  both  Invariants  1  and  2  will  hold 
at  the  end  of  step  k. 

Base  Case:  Before  the  procedure  begins,  at  step  0,  each  node  is  assigned  to  its  own 
backbone,  for  which  it  is  effectively  the  origin  (this  can  also  be  viewed  as  a  single  backbone 
link  incoming  to  this  node  from  an  imaginary  super-source).  Thus,  after  Step  0,  Invariant 
1  holds  trivially.  Invariant  2  is  trivially  true. 

Inductive  Step: 

Suppose Invariants 1 and 2 held at the end of step k − 1. Consider a particular cell $\mathcal{H}$
during step k.

Let the number of proper channels in $\mathcal{H}$ be c'.

From Lemma 32, $c' \ge c - \lfloor f/4 \rfloor$ for each cell. Each backbone $\mathcal{S}_b(x)$ that enters
cell $\mathcal{H}$ in step k has a previous hop-node in one of the 8 adjacent cells. Also note that, as
a consequence of Lemma 32, each previous hop node has at least $\lceil 3f/4 \rceil$ of cell $\mathcal{H}$'s proper
channels available to it as choices for the link that will enter cell $\mathcal{H}$ (since it can operate on
f channels, of which at most $\lfloor f/4 \rfloor$ can be non-proper in cell $\mathcal{H}$).
[Figure: bipartite graph. One side is the set $\mathcal{L}$, with one vertex for each backbone entering cell $\mathcal{H}$ in step k, and a subset $V \subseteq \mathcal{L}$; the other side is the set $\mathcal{P}$, with a group of channel vertices for each proper channel, containing the neighbor set $\mathcal{N}(V)$.]

Figure 4.6: Bipartite Graph for Cell $\mathcal{H}$ in step k

Channel-Allocation Construct a bipartite graph with two sets of vertices (Fig. 4.6):
one set (call it $\mathcal{L}$) has a vertex corresponding to each of the (source) backbones that enter
the cell $\mathcal{H}$ in step k. From Lemma 28, it follows that $|\mathcal{L}| \le \frac{5na(n)}{4}$. The other set (call it
$\mathcal{P}$) has $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ vertices for each proper channel i in cell $\mathcal{H}$, i.e., $|\mathcal{P}| = c' \left\lfloor \frac{5na(n)}{c} \right\rfloor$.

A backbone vertex is connected to all the vertices for the channels proper in $\mathcal{H}$ on which
the previous hop node of that backbone can switch (and which are therefore valid channel
choices for entering the cell $\mathcal{H}$). We show that there exists a matching that pairs each
backbone vertex to a unique channel vertex, through an argument based on Hall's marriage
theorem (Theorem 31). Thus, our objective is to show that for all $V \subseteq \mathcal{L}$, $|\mathcal{N}(V)| \ge |V|$,
where $\mathcal{N}(V) \subseteq \mathcal{P}$ is the union of the neighbor-sets of all vertices in V.
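We first note the following:

$$\left\lceil \frac{3f}{4} \right\rceil \left\lfloor \frac{5na(n)}{c} \right\rfloor \ge \frac{3f}{4}\left(\frac{5na(n)}{c} - 1\right) = \frac{15fna(n)}{4c} - \frac{3f}{4} \ge \frac{15fna(n)}{4c} - \frac{3fna(n)}{1000c} > \frac{29fna(n)}{8c} \quad (\because\ na(n) \ge 250c) \qquad (4.31)$$

Consider the following two cases: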


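Case 1: $|V| < \frac{29fna(n)}{8c}$

Consider any set V of backbone vertices such that $|V| < \frac{29fna(n)}{8c}$. Then, since there are
at most $\lfloor f/4 \rfloor$ non-proper channels in a cell, every previous hop node has at least $\lceil 3f/4 \rceil$
proper channel choices. For each proper channel there are $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ associated
channel vertices. Using (4.31), we obtain that $|\mathcal{N}(V)| \ge \left\lceil \frac{3f}{4} \right\rceil \left\lfloor \frac{5na(n)}{c} \right\rfloor > \frac{29fna(n)}{8c}$. Therefore
$|\mathcal{N}(V)| \ge |V|$.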


Case 2: $|V| \ge \frac{29fna(n)}{8c}$

Consider sets V of size at least $\frac{29fna(n)}{8c}$. Intuitively, to show that $|\mathcal{N}(V)| \ge |V|$ for all such
V, we first show that if a channel overload condition occurs, resulting in $|\mathcal{N}(V)| < |V|$ for
some V, then the overload must also manifest itself in some channel-aligned subset (i.e., a
subset where all incoming backbones corresponding to subset vertices have some common
proper channel i available to them). Thus, to show that no overload condition occurs, it
suffices to show that no overload condition occurs in any of these critical channel-aligned
subsets, which can be shown using Lemma 33. The argument is formalized as follows:

Let $\mathcal{V}_i$ be the set comprising all sets $U_i \subseteq \mathcal{L}$, such that all backbone vertices in $U_i$ have
channel i associated with them (i.e., all backbone vertices in $U_i$ have i available to them as
a valid proper channel choice for entering $\mathcal{H}$).


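Claim (a) $\forall U \in \bigcup_{i \text{ proper in } \mathcal{H}} \mathcal{V}_i$: if $|U| \ge \left\lceil \frac{29fna(n)}{8c} \right\rceil$ then $|\mathcal{N}(U)| \ge |\mathcal{L}|$.

Proof of Claim (a): By assumption, $U \in \mathcal{V}_i$ for some i that is proper in $\mathcal{H}$. Also, since no
node can be the previous hop in step k of more flows than those assigned to it in step k − 1,
and Invariant 1 held after step k − 1, therefore no previous hop node is common to more
than 14 backbone links entering $\mathcal{H}$ in step k. Let A be the set of distinct previous hop nodes
associated with U. If $|U| \ge \left\lceil \frac{29fna(n)}{8c} \right\rceil$, then

$$|A| \ge \frac{|U|}{14} \ge \frac{29fna(n)}{112c} > \frac{fna(n)}{4c} + 1 > \left\lceil \frac{fna(n)}{4c} \right\rceil$$

(note that $\frac{fna(n)}{c} \ge 250f \ge 500 > 112$).

Therefore, A contains at least one subset B satisfying $|B| = \left\lceil \frac{fna(n)}{4c} \right\rceil$. Recognizing that
all members of A, and hence all members of B, are capable of switching on channel i, we
can invoke Lemma 33 on B, to obtain that when f ≥ 100: $|C(B)| \ge \left\lceil \frac{3c}{8} \right\rceil$. This yields:

$$|\mathcal{N}(U)| \ge |C(B)| \left\lfloor \frac{5na(n)}{c} \right\rfloor \ge |C(B)| \left(\frac{5na(n)}{c} - 1\right) \ge \frac{15na(n)}{8} - \frac{3c}{8} \ge \frac{15na(n)}{8} - \frac{3}{8}\left(\frac{na(n)}{250}\right) \ge \frac{5na(n)}{4} \ge |\mathcal{L}|$$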

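Claim (b) Consider a set $V \subseteq \mathcal{L}$.

If $|\mathcal{N}(V)| < |V|$ then $\exists$ channel i proper in $\mathcal{H}$, and $S_i \subseteq V$ such that:

$$S_i \in \mathcal{V}_i \ \text{ and } \ |S_i| \ge \left\lceil \frac{29fna(n)}{8c} \right\rceil \qquad (4.32)$$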


Proof of Claim (b): Suppose $|\mathcal{N}(V)| < |V|$. Let us denote by $S_i \subseteq V$ the set of all
backbone vertices in V that are associated with channel i (i.e., have channel i available
as a valid proper channel choice for entering cell $\mathcal{H}$). Consider the bipartite sub-graph
$G_V$ induced by $V \cup \mathcal{N}(V)$, and assign all edges unit capacity. Construct the graph $G'_V =
(V \cup \mathcal{N}(V) \cup \{s, t\}, E)$ where s is a source node having a unit capacity edge to all vertices
$v \in V$, and t is a sink node, connected to each vertex $u \in \mathcal{N}(V)$ via a unit capacity edge
(thus, E comprises the edges in $G_V$ and the additional edges just described).

We try to obtain an (s, t) flow g in $G'_V$ such that all edges (s, v) are saturated. Each
vertex $v \in V$ sub-divides the unit of flow received from s equally amongst all edges (v, u)
outgoing from it. Since each vertex has edges to vertices of at least $\lceil 3f/4 \rceil$ channels, this yields
at least $\left\lceil \frac{3f}{4} \right\rceil \left\lfloor \frac{5na(n)}{c} \right\rfloor > \frac{29fna(n)}{8c}$ edges (see (4.31)). Thus, each $v \in V$
contributes at most $\frac{8c}{29fna(n)}$ units of flow to a vertex $u \in \mathcal{N}(V)$, i.e., $g(v, u) \le \frac{8c}{29fna(n)}$.
Hence no vertex $u \in \mathcal{N}(V)$ gets more than $h(u) = \sum_{v \in S_i} g(v, u) \le \frac{8c\,|S_i|}{29fna(n)}$ units of flow, where
i is the channel corresponding to vertex u. Resultantly, if $|S_i| \le \frac{29fna(n)}{8c}$ for all channels
i that are proper in cell $\mathcal{H}$, this implies that $h(u) \le 1$, and setting g(u, t) = h(u) yields the
desired (s, t) flow. Hence g is a valid flow that allows a unit of flow to pass through each
vertex $v \in V$. Therefore, from the Integrality Theorem (Theorem 32), we can obtain an
integer-valued flow, which yields a matching of size |V|. Therefore, from Hall's marriage
theorem (Theorem 31), $|\mathcal{N}(V)| \ge |V|$ (else a matching of size |V| could not have existed).
This yields a contradiction. Hence, there must exist a proper channel i, and $S_i \subseteq V$ such
that $S_i \in \mathcal{V}_i$ and $|S_i| > \frac{29fna(n)}{8c}$. Since set-cardinality must necessarily be an integer, it
follows that $|S_i| \ge \left\lceil \frac{29fna(n)}{8c} \right\rceil$, and (4.32) holds.
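The flow used in this argument is easy to examine on a small instance: every backbone vertex splits one unit of flow equally over its edges, and the construction works as long as no channel vertex receives more than one unit. The sketch below computes those loads exactly with rational arithmetic; the function names and the toy inputs are assumptions for illustration.

```python
from fractions import Fraction


def fractional_loads(neighbors):
    """Load h(u) on each channel vertex under the equal-split flow.

    neighbors: dict backbone vertex -> list of adjacent channel vertices.
    Each backbone vertex sends 1/deg(v) units along each of its edges.
    """
    h = {}
    for v, adjacent in neighbors.items():
        share = Fraction(1, len(adjacent))
        for u in adjacent:
            h[u] = h.get(u, Fraction(0)) + share
    return h


def split_is_feasible(neighbors):
    """True if every channel vertex receives at most one unit of flow."""
    return all(load <= 1 for load in fractional_loads(neighbors).values())
```

When the split is feasible, the fractional flow saturates every source edge, and an integrality argument of the kind invoked in the proof turns it into a matching that covers every backbone vertex.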

Claim (c) $\forall V \subseteq \mathcal{L}$ such that $|V| \ge \frac{29fna(n)}{8c}$: $|\mathcal{N}(V)| \ge |V|$.

Proof of Claim (c): Suppose $|\mathcal{N}(V)| < |V|$. Then, from Claim (b), there exists a set
$S_i \subseteq V$ such that $S_i \in \mathcal{V}_i$, and $|S_i| \ge \left\lceil \frac{29fna(n)}{8c} \right\rceil$. Thus $S_i$ qualifies as a set to which Claim
(a) applies. Invoking Claim (a) on this set $S_i$, it follows that $|\mathcal{N}(V)| \ge |\mathcal{N}(S_i)| \ge |\mathcal{L}| \ge |V|$.
This yields a contradiction. Thus, $|\mathcal{N}(V)| \ge |V|$.

Taking both Case 1 and Case 2 into account, we have thus proved that $\forall V \subseteq \mathcal{L}$:
$|\mathcal{N}(V)| \ge |V|$. Therefore, from Hall's marriage theorem (Theorem 31), each backbone
vertex can be matched with a unique channel vertex, and the corresponding backbone will
be assigned to the channel with which this vertex is associated. Thus all backbones get
assigned a channel, and (since there are $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ channel vertices for each proper channel)
no more than $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ incoming backbone links are assigned to any single channel.

While Hall's marriage theorem proves that such a matching exists, the matching itself
can be computed using the Ford-Fulkerson method [22] on a flow network obtained from
the bipartite graph by adding a source with an edge to each vertex in $\mathcal{L}$, a sink to which
each vertex in $\mathcal{P}$ has an edge, and assigning unit capacity to all edges.
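A minimal sketch of this computation is given below. Instead of materializing the replicated channel vertices, it gives each channel a capacity (the number of channel vertices it would have had) and searches for augmenting paths, which is the Ford-Fulkerson idea specialized to bipartite matching. The function name `allocate_channels` and its inputs are assumptions of the sketch, not the procedure as specified in the text.

```python
def allocate_channels(choices, capacity):
    """Assign each backbone one of its allowed channels.

    choices:  dict backbone -> list of channels valid for entering the cell.
    capacity: maximum number of backbones per channel (stands in for the
              number of replicated channel vertices per channel).
    Returns dict backbone -> channel, or None if no full assignment exists.
    """
    users = {}     # channel -> backbones currently placed on it
    assigned = {}  # backbone -> channel

    def place(b, seen):
        for ch in choices[b]:
            if ch in seen:
                continue
            seen.add(ch)
            placed = users.setdefault(ch, [])
            if len(placed) < capacity:
                placed.append(b)
                assigned[b] = ch
                return True
            # Channel is full: try to move one of its users elsewhere.
            for other in list(placed):
                if place(other, seen):
                    placed.remove(other)
                    placed.append(b)
                    assigned[b] = ch
                    return True
        return False

    for b in choices:
        if not place(b, set()):
            return None
    return assigned
```

With `capacity = 2`, three backbones whose choices are `[1]`, `[1, 2]` and `[1, 2]` can all be placed; with `capacity = 1` they cannot, and the function returns `None`.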

Thus,  Invariant  2  continues  to  hold  after  the  channel-allocation  phase  of  step  k. 

Node-Allocation Having determined the channel each incoming backbone link should
use to enter cell $\mathcal{H}$, we need to assign a node in cell $\mathcal{H}$ to each backbone. For this, we again
construct a bipartite graph. In this graph, the first set of vertices (call it $\mathcal{F}$) comprises a
vertex for each backbone link entering cell $\mathcal{H}$ in step k. The second set (call it $\mathcal{R}$) comprises
14 vertices for each backbone candidate node in cell $\mathcal{H}$. A vertex x in $\mathcal{F}$ has an edge with
a vertex y in $\mathcal{R}$ iff the actual backbone candidate node associated with y is capable of
switching on the channel assigned to the backbone-link associated with vertex x in the
preceding channel-allocation phase (this implies that y is indeed a valid relay choice for the
backbone link corresponding to x).

Each  vertex  x  G  J-  has  degree  at  least  lAMu,  since  it  is  assigned  to  a  proper  channel, 
which  by  definition  has  at  least  representatives  in  cell  7i,  each  of  which  has  14  associated 
vertices  in  TZ.  Also  recall  that  Once  again  we  seek  to  show  that  for  all 

VC^,  |AA(V)|  >  |V|. 


Consider any set $V \subseteq \mathcal{F}$.

Since no channel is assigned more than $\left\lfloor \frac{5na(n)}{c} \right\rfloor$ entering backbone links during the
channel-allocation phase of this step, the vertices in V are cumulatively associated with
at least $m \ge |V| / \left\lfloor \frac{5na(n)}{c} \right\rfloor$ distinct proper channels. Since each of these channels has at least

Mu  backbone  candidate  nodes  capable  of  switching  on  them,  and  any  one  node  can  only 
switch  on  up  to  /  proper  channels,  this  implies  that  the  number  of  distinct  nodes  in  cell 
Ti  cumulatively  associated  with  these  m  >  proper  channels  is  at  least  > 

1  C  1  f  1  C  1 

|V|r9/"°(")-|  gly, 

— ^  ^  1^-  Since  each  backbone  candidate  node  has  14  vertices  in  TZ,  it  follows  that 

|V(v)i>i4(m)>im>|v|, 

Then invoking Hall's Marriage Theorem again, each vertex $x \in \mathcal{F}$ can be matched
with a unique vertex $y \in \mathcal{R}$, and the actual network node associated with y is deemed the
backbone representative for the backbone corresponding to vertex x in cell $\mathcal{H}$ (the matching
can again be computed via the Ford-Fulkerson method). Since there are at most 14 vertices
associated with a node, no node is assigned more than 14 incoming backbone links in step
k, and Invariant 1 continues to hold after the node-allocation phase of step k.
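The node-allocation step can be sketched as a small max-flow computation: source to each backbone link (capacity 1), link to every node able to switch on the link's assigned channel (capacity 1), and node to sink with capacity equal to the number of slots per node (14 in the text). The code below uses breadth-first augmenting paths (the Edmonds-Karp variant of the Ford-Fulkerson method); all names and the input layout are assumptions of the sketch.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Augment one unit at a time along BFS paths; `cap` holds residual capacities."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit back along the path
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] = cap[v].get(u, 0) + 1
            v = u
        flow += 1


def allocate_nodes(link_channel, node_channels, slots=14):
    """Match each backbone link to a node that can use the link's channel.

    link_channel:  dict link -> channel assigned in the channel-allocation phase.
    node_channels: dict node -> set of channels the node can switch on.
    slots:         maximum number of links a single node may accept.
    Returns dict link -> node, or None if some link cannot be served.
    """
    cap = defaultdict(dict)
    for link, ch in link_channel.items():
        cap["s"][("link", link)] = 1
        for node, chans in node_channels.items():
            if ch in chans:
                cap[("link", link)][("node", node)] = 1
    for node in node_channels:
        cap[("node", node)]["t"] = slots
    if max_flow(cap, "s", "t") < len(link_channel):
        return None
    # A reverse residual edge node -> link of capacity 1 marks a used edge.
    return {link: node for link in link_channel for node in node_channels
            if cap[("node", node)].get(("link", link), 0) == 1}
```

Giving each node a sink capacity of `slots` is equivalent to replicating it into that many vertices, so no node is handed more than `slots` incoming links.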

This  proves  that  both  Invariants  1  and  2  continue  to  hold  after  step  k. 


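It follows that, after step $h_{max}$ (where $h_{max} = O\left(\frac{1}{\sqrt{a(n)}}\right)$), each cell $\mathcal{H}$ has
$O\left(\left\lfloor \frac{5na(n)}{c} \right\rfloor h_{max}\right) = O\left(\frac{n\sqrt{a(n)}}{c}\right)$ entering backbone links per channel, and each node appears on
$O(h_{max}) = O\left(\frac{1}{\sqrt{a(n)}}\right)$ (from (4.15)) source backbones. □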


Detour backbones: We can construct the SPD' stretch of backbone segment $\mathcal{S}_b(x)$ for
the detour-routed flows in any manner possible, i.e., by assigning links to any eligible
node/channel (at least one eligible node is known to exist since, as a consequence of Lemma
32, each node can switch on at least $\lceil 3f/4 \rceil$ channels that are proper in the next cell).

Additional last hop: Now let us account for the possible additional last hop that some
flows may have, yielding an additional cell in $\mathcal{S}_b(x)$ (in addition to those traversed from
source x to pseudo-destination). We can extend the backbones over the additional hop in
any feasible manner (and as argued for the detour backbones, it is indeed feasible to do so).

Construction of $\mathcal{D}_b(x)$: Note that by our routing strategy a flow will only attempt to
transition to the destination backbone when it enters the ready-for-transition phase.

From Lemma 36, the total number of flow-routes traversing a cell in ready-for-transition
phase is $O(\log^4 n)$ (counting possible repeat traversals), which is asymptotically dominated
by $O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$.

Therefore,  for  each  node  x,  and  for  each  flow  i  for  which  x  is  the  destination:  we  can 
construct  V^^\x)  by  using  any  feasible  nodes/channels  (it  is  always  feasible  to  construct 
T)^\x)  as  each  node  can  switch  on  at  least  \^~\  channels  that  are  proper  in  the  next  cell 
to  be  traversed). 

4.7.2  Load  Balance  within  a  Cell 

Now we show that no channel or interface bottlenecks form in the network when our
described construction is used. As in Section 4.6, we use the following terminology: A flow-link
is said to enter a cell $H$ on a channel $j$ if the flow's route includes a hop (link) $(v_{i-1}, v_i)$,
where $v_{i-1}$ is in a cell adjacent to $H$, $v_i$ is in $H$, and $v_{i-1}$ transmits the flow's packets
to $v_i$ using channel $j$ (this naturally implies that both $v_{i-1}$ and $v_i$ can operate on channel
$j$). Similarly, a flow-link is said to leave a cell $H$ on channel $j$ if the route includes a link
$(v_i, v_{i+1})$, where $v_i$ is in $H$, $v_{i+1}$ is in a cell adjacent to $H$, and $v_i$ transmits the flow's packets
to $v_{i+1}$ using channel $j$.


Per-Channel  Load 

Lemma 38. The number of flow-links that enter any cell on a given channel is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ w.h.p.

Proof. A flow-route traversing cells $H_1, H_2, \ldots$ may enter a cell $H_j$ on a channel $i$
under the following circumstances:

1. The flow is either in progress-on-source-backbone phase, or it is in the ready-for-transition
phase but is yet to make a transition to the destination backbone, and
$i$ is the channel assigned to the source backbone link between the backbone nodes in
$H_{j-1}$ and $H_j$.

2. The flow has already made a transition, and $i$ is the channel assigned to the link
between the destination backbone nodes in $H_{j-1}$ and $H_j$.

We first consider the flow-links that enter a cell in progress-on-source-backbone phase,
i.e., they are proceeding on their respective source backbone segments. Recall that these
are all non-detour-routed flows, since detour-routed flows are always in ready-for-transition
phase. The number of such flows that enter any cell on a single channel is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ (Lemma 37).

We now need to account for the fact that some of the flow-links may enter the cell in the
ready-for-transition phase. From Lemma 36 there are $O(\log^4 n)$ flow-routes traversing any
cell in ready-for-transition phase w.h.p. (recall that these include the detour-routed flows
with their repeat traversals counted separately, and also the possible additional last $D'D$
hop for all flows). Thus, regardless of whether they are still on their source backbone, or
have already made the transition to their destination backbone, the number of such entering
flow-links assigned to any single channel is $O(\log^4 n)$.

Hence the number of flow-links entering on a single channel is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) + O(\log^4 n) = O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$
w.h.p. for each cell of the network. □


Lemma 39. The number of flow-links that leave any cell on any single channel is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ w.h.p.

Proof. Note that the flow-links that leave the cell must then enter one of the 8 adjacent
cells on that channel (as a backbone link for a flow leaves the current cell, and enters an
adjacent cell). Hence, flow-links leaving the cell on a channel can be no more than 8 times
the maximum number of flow-links entering a cell on any one channel, which has been
established as $O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ in Lemma 38. Therefore, the total number of
flow-links leaving any given cell on a given channel is also
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ w.h.p. □


Lemma 40. The number of additional transition links scheduled on any single channel
within any cell is $O(\log^4 n)$ w.h.p.

Figure 4.7: Two additional transition links for a flow lying wholly within the cell

Proof. Recall the transition strategy outlined in the proof of Lemma 34, whereby the flow
locates a cell along the route where the source backbone node $q_x$ and destination backbone
node $q_y$ are connected through a third node $z$. This yields two additional links, $q_x \to z$ and
$z \to q_y$, that lie entirely within the cell (Fig. 4.7). Note that the number of flows performing
this transition in the cell can be no more than the number of flows traversing the cell in
ready-for-transition phase. From Lemma 36 there are $O(\log^4 n)$ such flows traversing any
cell w.h.p. In the worst case, we can count 2 additional links for each such flow as being all
assigned to one channel. The result follows from this observation. □

Per-Node  Load 

Lemma 41. The number of flow-links that are assigned to any one node in any cell is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ w.h.p.

Proof. A node is always assigned an outgoing flow-link for the single flow for which it is
the source. A node is also assigned an incoming flow-link for each flow for which it is the
destination, and from Lemma 1 there are $O(\log n)$ such flows for any node w.h.p. Besides, a
node may be assigned a pair of flow-links (incoming and outgoing) for flows that are in the
ready-for-transition phase, for which it facilitates a transition (if it is a transition facilitator
node), or on whose source or destination backbone it occurs (if it is a backbone candidate).
There are $O(\log^4 n)$ such flow-links (counting repeat traversals by the same flow, additional
last hop, and additional transition links separately) in a cell w.h.p. (Lemma 36 and Lemma
40). Thus, a node can only have $O(\log^4 n)$ such flow-links assigned.

We now consider the flows in progress-on-source-backbone phase that do not originate
in the cell. Note that these must be non-detour-routed flows in their $SD'$ stretch. These
flows are on their source backbone, and from Lemma 37, each backbone candidate node has
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ such incoming flow-links assigned. Corresponding to each such
incoming link, there is an outgoing link (since the node is a relay for these flows). Thus, the
total number of such assigned flow-links is $O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$.

Therefore, the number of flow-links assigned to any single node is
$1 + O(\log n) + O(\log^4 n) + O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) = O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$. □

4.7.3  Transmission  Schedule 

Similar to adjacent (c, f) assignment, and the sub-optimal lower bound construction of
Section 4.6, we can obtain a two-level feasible transmission schedule. Since each cell can
face interference from at most a constant number $\gamma$ of nearby cells, the resultant
cell-interference graph (a graph with a vertex for each cell, and an edge between two vertices
if the corresponding cells can interfere with each other) has a chromatic number of at most
$1 + \gamma$. Hence, we can come up with a global schedule having $1 + \gamma$ unit time slots in each
round. In any slot, if a cell is active, then all interfering cells are inactive.

For intra-cell scheduling, we construct a conflict graph based on the nodes in the active
cell and its adjacent cells (note that the hop-sender of each flow shall lie in the active cell,
and the hop-receiver shall lie in one of the adjacent cells, except for transition links, for
which both lie in the active cell), as follows:

We create a separate vertex for each flow-link for which a node in the cell needs to
transmit data (repeat traversals by the same flow's route or additional transition links
lying wholly within the cell are counted as distinct flow-links for the purpose of scheduling;
these have been accounted for while bounding the number of flow-links in a cell in previous
lemmas). Since each flow-link has an assigned channel on which it operates, each vertex
in the graph has an implicit associated channel. Besides, each vertex has an associated
pair of nodes corresponding to the hop endpoints. Two vertices are connected by an edge
if (1) they have the same associated channel, or (2) at least one of their associated nodes
is the same. The scheduling problem thus reduces to obtaining a vertex-coloring of this
graph. A vertex coloring ensures that (1) a node is never simultaneously
sending/receiving for more than one flow, and (2) no two flow-links on the same channel are
active simultaneously. By construction, the number of neighbors of a graph vertex is upper bounded
by the number of flow-links requiring a transmission in the active cell on that channel, plus
the number of flow-links assigned to the flow's two hop endpoints (both hop-sender and
hop-receiver). It can be seen from Lemma 39, Lemma 40 and Lemma 41 that the degree of the
conflict graph is
$O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) + O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) + O(\log^4 n) + O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) + O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right) = O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$
(note that the $O(\log^4 n)$ term is dominated, as follows from the bound shown in (4.14)).

Thus the graph can be colored in $O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$ colors. Hence, the cell-slot
(which can be assumed to be of unit time) is divided into $O\left(\frac{1}{c}\sqrt{\frac{n \log n}{p_{rnd}}}\right)$
equal-length subslots, and all the flow-links get a subslot for transmission. This implies that
each flow-link gets an $\Omega\left(c\sqrt{\frac{p_{rnd}}{n \log n}}\right)$ fraction of the slot-time. Moreover, each cell
gets at least one slot in every $1 + \gamma$ slots, where $\gamma$ is a constant, and each channel has
bandwidth $\frac{W}{c}$. Thus, the throughput each flow can get is $\Omega\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$.
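
The intra-cell scheduling step can be sketched as follows. This is an illustration under stated assumptions rather than the construction itself: the `FlowLink` record, the function names and the sample links are hypothetical, and the coloring used is plain first-fit greedy coloring, which needs at most (maximum degree + 1) colors and is therefore enough for the order bound argued above.

```python
from itertools import combinations
from typing import List, NamedTuple


class FlowLink(NamedTuple):
    sender: str     # hop-sender (lies in the active cell)
    receiver: str   # hop-receiver
    channel: int    # channel assigned to this flow-link


def conflicts(a: FlowLink, b: FlowLink) -> bool:
    """Edge rule of the conflict graph: same channel, or a shared endpoint."""
    same_channel = a.channel == b.channel
    shared_node = bool({a.sender, a.receiver} & {b.sender, b.receiver})
    return same_channel or shared_node


def greedy_subslots(links: List[FlowLink]) -> List[int]:
    """Assign each flow-link a subslot index via first-fit greedy coloring."""
    n = len(links)
    neighbours = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if conflicts(links[i], links[j]):
            neighbours[i].add(j)
            neighbours[j].add(i)
    slot = [-1] * n
    for v in range(n):
        used = {slot[u] for u in neighbours[v] if slot[u] >= 0}
        s = 0
        while s in used:  # smallest subslot not taken by a neighbour
            s += 1
        slot[v] = s
    return slot


# Hypothetical flow-links in one active cell.
links = [
    FlowLink("a", "b", 1),
    FlowLink("c", "d", 1),  # same channel as the first link
    FlowLink("a", "e", 2),  # shares node "a" with the first link
    FlowLink("f", "g", 3),  # conflicts with nothing
]
slots = greedy_subslots(links)
```

The number of subslots in the cell-slot is `max(slots) + 1`; flow-links that share a subslot use different channels and disjoint endpoints.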

Theorem 7. When $c = O(\log n)$ and $100 \le f \le c$, construction CR2 yields a per-flow
throughput of $\Omega\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$ with random (c, f) assignment.

We now describe the construction CR*.

Construction CR*

•  When $f < 100$: Use construction CR1 described in Section 4.6, which achieves a
per-flow throughput of $\Omega\left(W\sqrt{\frac{f}{cn \log n}}\right)$ (Theorem 6). From Lemma 14, it follows that
$\sqrt{\frac{f}{cn \log n}} \Big/ \sqrt{\frac{p_{rnd}}{n \log n}} = \Omega\left(\frac{1}{\sqrt{f}}\right)$. Thus, for $f < 100$,
$\sqrt{\frac{f}{cn \log n}} \Big/ \sqrt{\frac{p_{rnd}}{n \log n}} = \Omega(1)$.

•  When $f \ge 100$: Use construction CR2, which achieves a per-flow throughput of
$\Omega\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$ whenever $f \ge 100$ (Theorem 7).

This yields the following result:

Theorem 8. When $c = O(\log n)$ and $2 \le f \le c$, construction CR* yields a per-flow
throughput of $\Omega\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$.

Combining Theorem 8 with the upper bound on capacity proved in Section 4.5, we
obtain the following theorem:

Theorem 9. When $c = O(\log n)$ and $2 \le f \le c$, the per-flow network capacity with random
(c, f) assignment is $\Theta\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$.

Figure 4.8: Comparison of probability of sharing a channel (plot title: "Communication Probability with Constrained Switching")


4.8  Discussion 


We have shown that the capacity for random (c, f) assignment is $\Theta\left(W\sqrt{\frac{p_{rnd}}{n \log n}}\right)$ in the
regime $c = O(\log n)$. It is easy to see that:

$$p_{rnd} = 1 - \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right) \ge 1 - \left(1 - \frac{f}{c}\right)^{f} \ge 1 - e^{-\frac{f^2}{c}} \qquad (4.33)$$

(Note that the product on the R.H.S. above is uniformly 0 whenever $f \ge c - f + 1$, as one of the terms in the product is 0.)

Therefore $f = \Omega(\sqrt{c}) \implies p_{rnd} = \Omega(1)$. To illustrate, setting $f = \sqrt{c}$ yields
$p_{rnd} \ge 1 - \frac{1}{e} > \frac{1}{2}$. In light of (4.33), our result implies that $f = \Omega(\sqrt{c})$ suffices for achieving
capacity of the same order as the unconstrained switching case [65, 66].

We also described a simpler construction that achieves per-flow throughput $\Omega\left(W\sqrt{\frac{f}{cn \log n}}\right)$.
For $f = \sqrt{c}$, using this simpler construction would yield a capacity degradation by a factor
of the order of $c^{1/4}$ compared to the unconstrained switching case.
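
A quick numerical check of these claims is sketched below. It assumes that (4.33) describes two nodes that each pick $f$ of the $c$ channels uniformly at random, so that they share no channel with probability $\binom{c-f}{f} / \binom{c}{f}$. The function names are illustrative, and the last function only compares the two throughput expressions, ignoring the constants hidden in the order notation.

```python
import math


def p_rnd(c: int, f: int) -> float:
    """Probability that two nodes share at least one channel (closed form)."""
    if not 1 <= f <= c:
        raise ValueError("need 1 <= f <= c")
    # math.comb(c - f, f) is 0 when f > c - f, i.e. the two sets must overlap
    return 1.0 - math.comb(c - f, f) / math.comb(c, f)


def p_rnd_product(c: int, f: int) -> float:
    """Same quantity via the product (1 - f/c)(1 - f/(c-1))...(1 - f/(c-f+1))."""
    prod = 1.0
    for i in range(1, f + 1):
        term = 1.0 - f / (c - i + 1)
        if term <= 0.0:  # a zero term occurs whenever f >= c - f + 1
            return 1.0
        prod *= term
    return 1.0 - prod


def exp_lower_bound(c: int, f: int) -> float:
    """Right-most expression in (4.33): 1 - exp(-f^2 / c)."""
    return 1.0 - math.exp(-f * f / c)


def simple_vs_optimal_ratio(c: int, f: int) -> float:
    """sqrt(p_rnd / (f/c)): ratio of the two throughput expressions."""
    return math.sqrt(p_rnd(c, f) * c / f)


c, f = 100, 10  # f = sqrt(c)
exact = p_rnd(c, f)
bound = exp_lower_bound(c, f)
ratio = simple_vs_optimal_ratio(c, f)
```

For $f = \sqrt{c}$ the exact value stays above $1 - 1/e$, and the ratio lies between $\sqrt{1 - 1/e}\, c^{1/4}$ and $c^{1/4}$, in line with the $c^{1/4}$ degradation noted above.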

Fig.  4.8  is  a  numerical  plot  (obtained  by  setting  c  to  10^,  and  varying  /  from  2  to  c) 
depicting  how  the  probability  pmd  compares  with  the  probability  p^^^  =  min{ 
Recall that $p_{rnd}$ is the probability that two nodes share at least one channel in random (c, f)
assignment, and the quantity it is compared against is the upper bound on the probability that
two nodes share at least one channel in adjacent (c, f) assignment (Chapter 3). It must be
remarked that though both models allow nodes to switch between a subset of $f$ channels, the
additional degrees of freedom obtained via the random assignment model lead to a much quicker
convergence of $p_{rnd}$ toward 1.

It  is  to  be  noted  that  the  optimal  construction  is  substantially  more  complex  than  the 
simpler  construction  and  requires  that  all  routes  be  constructed  in  lock-step.  Thus  the 
two  constructions  represent  an  interesting  trade-off  in  capacity  versus  scheduling/routing 
complexity. 

Moreover, the optimal construction provides many useful insights into the implications
of heterogeneous interfaces for routing in a realistic scale network. Note that the need for
a synchronized route construction procedure arose from a strong coupling between choices
of channels/relays at each hop, over and above what one would find in a network with
homogeneous interfaces.

Let us re-examine the implications of heterogeneous interfaces that are subject to switching
constraints: if we have to choose a route for a flow, then the first hop transmission must
necessarily be scheduled on one of the $f$ channels that the source can switch on (since the
source will be sending it); the first relay node must also be one that has at least one channel
in common with the source node (so that it can receive the transmission); moreover,
if channel $x$ is chosen, then the relay node must be capable of switching on channel $x$.
Similarly, the choice of channel at each subsequent hop is limited to the channel-subset of
the hop-sender, and the choice of next relay is limited to nodes that can switch on such a
channel. Thus the choice of relay at hop $i$ determines the channel choices and consequently
relay choices available for hop $i + 1$. This leads to a coupling across hops of the same route.
Moreover, this also leads to a strong coupling across routes. It is due to these concerns
that the capacity-achieving construction has a synchronized route selection procedure. We
present a simple example to illustrate this issue:

Consider nodes A, B, C, D, X, Y, each of which is equipped with a single interface.
Consider two flows A → B and C → D. A and B are not neighbors, nor are C and D, but the nodes X, Y
are neighbors of all nodes A, B, C, D, and can thus act as relays for the flows. The
channel-sets of the nodes are as shown in Fig. 4.9. The first flow can use the route
$A \xrightarrow{1} X \xrightarrow{2} B$ or $A \xrightarrow{3} Y \xrightarrow{4} B$. The second flow has only one choice,
$C \xrightarrow{3} Y \xrightarrow{4} D$. Suppose we perform route-selection for the two flows sequentially
in the order A → B, C → D. If the first flow chooses its route without consideration of the
second flow and its constraints, it may end up choosing $A \xrightarrow{3} Y \xrightarrow{4} B$. Since the
second flow must necessarily choose $C \xrightarrow{3} Y \xrightarrow{4} D$, this will lead to a
bottleneck. The optimal choice is for the first flow to use route $A \xrightarrow{1} X \xrightarrow{2} B$ and for the second
flow to use $C \xrightarrow{3} Y \xrightarrow{4} D$. If all interfaces could switch on all channels, this problem would not
have arisen, as regardless of which route the first flow chose, the second flow could always
choose the node-disjoint route, and use different channels on that route. Thus, interfaces
with constrained switching ability require more sophisticated routing algorithms to reduce
the chances of severe bottleneck formation due to a sub-optimal routing choice.

Figure 4.9: Example illustrating coupling between routes
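
The example can be replayed in a few lines of code. The sketch below is illustrative only: the route table encodes just the options stated in the text (relay and the two hop channels), the bottleneck measure (largest number of flows sharing one single-interface relay) is an assumption made for the illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Route options as (relay, (first-hop channel, second-hop channel)).
ROUTES = {
    "A->B": [("X", (1, 2)), ("Y", (3, 4))],
    "C->D": [("Y", (3, 4))],
}


def bottleneck(choice):
    """Largest number of flows relayed by any one single-interface relay."""
    relay_load = Counter(relay for relay, _channels in choice.values())
    return max(relay_load.values())


def oblivious_selection(routes):
    """Each flow picks a route with no regard for other flows.

    Taking the last listed option stands in for an arbitrary, uncoordinated
    choice; for flow A->B this happens to be the route via Y.
    """
    return {flow: options[-1] for flow, options in routes.items()}


def joint_selection(routes):
    """Enumerate all route combinations and keep one with the least bottleneck."""
    flows = list(routes)
    best = min(
        product(*(routes[flow] for flow in flows)),
        key=lambda combo: bottleneck(dict(zip(flows, combo))),
    )
    return dict(zip(flows, best))


uncoordinated = oblivious_selection(ROUTES)
coordinated = joint_selection(ROUTES)
```

Here `bottleneck(uncoordinated)` is 2 because both flows go through Y, while `bottleneck(coordinated)` is 1. Exhaustive enumeration is only viable for toy instances like this one.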
Chapter 5

Scheduling in Multi-Channel Wireless Networks


In this chapter, we examine scheduling issues in multi-channel wireless networks, where
channels may have heterogeneous rate characteristics. We also briefly discuss the scheduling
implications of interface heterogeneity. Appropriate scheduling policies are of utmost
importance in achieving good throughput characteristics in a multi-hop wireless network. The
seminal work of Tassiulas and Ephremides yielded a throughput-optimal scheduler, which is
capable of scheduling all "feasible" traffic flows while maintaining stability of queues [110].
However, such an optimal scheduler is difficult to implement in practice. Consequently,
various imperfect scheduling strategies, which trade off throughput for simplicity, have been
proposed ([75, 119, 120, 103] amongst others).

When multiple orthogonal channels are available in a wireless network, it is possible to
get substantial performance improvement (compared to the use of just one of these channels)
by harnessing the spectral resource to the maximum extent possible. However, this
also gives rise to non-trivial channel coordination issues. The situation is exacerbated by
variability in the achievable data-rates across different channels on a link. Such variability
may arise for various reasons, such as the use of different modulations, different propagation
characteristics, or time-varying channel conditions. In this chapter, our focus is on
heterogeneity in channel rates which is time-invariant.

Computing an optimal schedule, even in a single-channel network, is usually intractable
due both to the need for global information and to computational complexity. However,
imperfect schedulers requiring limited local information can typically be designed, which provide
acceptable worst-case (and typically much better average-case) performance degradation
compared to the optimal. In a multi-channel network, the local information exchange
required by even an imperfect scheduler can be quite prohibitive, as information may be
needed on a per-channel basis. For instance, Lin and Rasool [74] have described a scheduling
algorithm for multi-channel multi-radio wireless networks that requires information about
per-channel queues at all interfering links. This provides a strong motivation for the study
of scheduling algorithms that can operate with limited information, while still providing
acceptable worst-case performance guarantees.

In this chapter, we examine the scheduling implications of multiple channels, and
heterogeneity in channel-rates. We begin by briefly discussing related work in Section 5.1. We
introduce the model, definitions and notation in Section 5.2. Scheduling issues that arise
in multi-channel wireless networks are discussed in Section 5.3. Section 5.4 presents a brief
summary of our results. We present a result on the cardinality of the set of links scheduled
by any maximal scheduler in Section 5.5. In Section 5.6, we derive a lower bound on
performance of a greedy maximal scheduler, which improves upon existing bounds for this
scheduler. In Section 5.7, we describe a scheduler that operates with limited information,
and prove a lower bound on its performance. In Section 5.8, we briefly discuss the issue of
scheduling with heterogeneous radios, and in Section 5.9 we identify interesting directions
for future work.

5.1  Related  Work 

The issue of throughput-optimal scheduling was considered in the seminal work of Tassiulas
and Ephremides [110], in which they described the Dynamic Backpressure Scheduler, which
is throughput-optimal. The impact of imperfect scheduling on the convergence of joint
rate-control and scheduling was examined in [75].

A maximal scheduler combined with a local threshold-based participation rule has been
proposed in [121]. The efficiency ratio of the greedy maximal scheduler has been studied in
[25, 49, 50, 48], amongst others. It was shown in [25] that for a class of graphs, with conflicts
amongst adjacent links, greedy maximal matching yields an efficiency-ratio of 1. These
topologies are those which satisfy a certain property termed the local pooling condition. In
[49], this was generalized to $\sigma$-local pooling ($\sigma \le 1$), and it was shown that the greedy
maximal matching algorithm achieves an efficiency-ratio of $\sigma$ in all topologies where the
local pooling factor is $\sigma$. This result was further generalized to general interference models
in [50].

A queue-loading algorithm to be used with a maximal scheduler in multi-channel
multi-radio networks has been described in [74]. Cross-layer resource allocation in multi-channel
wireless networks has been considered in [81].

5.2  Preliminaries 

We  consider  a  multi-hop  wireless  network.  For  simplicity,  we  largely  limit  our  discussion  to 
nodes  equipped  with  a  single  radio-interface  capable  of  tuning  to  any  one  available  channel 
at  any  given  time.  All  interfaces  in  the  network  have  identical  operational  capabilities,  and 
may  switch  between  the  available  channels  if  desired,  i.e.,  there  are  no  switching  constraints. 
Many  of  the  presented  results  can  also  be  used  to  obtain  results  for  the  case  when  each 
node  is  equipped  with  multiple  interfaces;  we  briefly  discuss  this  issue. 

The wireless network is viewed as a directed graph, with each directed link in the
graph representing an available communication link. We model interference using a conflict
relation between links. Two links are said to conflict with each other if it is only feasible
to schedule one of the links on a certain channel at any given time. The conflict relation
is assumed to be symmetric. The conflict-based interference model provides a tractable
approximation of reality: while it does not capture the wireless channel precisely, it is more
amenable to analysis. Such conflict-based interference models have been used frequently in
past work (e.g., [121, 74]).

Time is assumed to be slotted, with the slot duration being 1 unit time (i.e., we use slot
duration as the time unit). In each time slot, the scheduler used in the network determines
which links should transmit in that time slot, as well as the channel to be used for each
such transmission.
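
As a concrete reading of this model, the sketch below checks whether a proposed per-slot schedule is conflict-free. It is an illustration only: the `Link` representation, the function name and the sample conflict relation are hypothetical, and the shared-node check reflects the single-interface assumption stated at the start of this section.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Link = Tuple[str, str]  # directed link as (transmitter, receiver)


def is_feasible_slot(schedule: Dict[Link, int],
                     conflict: Set[FrozenSet[Link]]) -> bool:
    """Check one time slot's schedule, given as {link: channel}.

    conflict holds unordered pairs of links, which makes the relation
    symmetric by construction.
    """
    for (l1, c1), (l2, c2) in combinations(schedule.items(), 2):
        if set(l1) & set(l2):
            return False  # a shared node has only one interface
        if c1 == c2 and frozenset((l1, l2)) in conflict:
            return False  # conflicting links on the same channel
    return True


# Hypothetical conflict relation: links a->b and c->d interfere.
conflict = {frozenset({("a", "b"), ("c", "d")})}
same_channel = is_feasible_slot({("a", "b"): 1, ("c", "d"): 1}, conflict)
split_channels = is_feasible_slot({("a", "b"): 1, ("c", "d"): 2}, conflict)
```

With these inputs `same_channel` is `False` and `split_channels` is `True`: moving one of the two conflicting links to a second orthogonal channel lets both transmit in the same slot.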

We  now  introduce  some  notation  and  terminology. 

The  network  is  viewed  as  a  collection  of  directed  links,  where  each  link  is  a  pair  of  nodes 
that  are  capable  of  direct  communication  with  non-zero  rate. 

•  $\mathcal{L}$ denotes the set of directed links in the network.

•  $\mathcal{C}$ is the set of all available orthogonal channels. Thus, $|\mathcal{C}|$ is the number of available
channels.

•  We say that a scheduler schedules link-channel pair $(l, c)$ if it schedules link $l$ for
transmission on channel $c$.

•  $r_l^c$ denotes the rate achievable on link $l$ by operating link $l$ on channel $c$, provided
that no conflicting link is also scheduled on channel $c$. For simplicity, we assume that
$r_l^c > 0$ for all $l \in \mathcal{L}$ and $c \in \mathcal{C}$.¹ The rates $r_l^c$ do not vary with time. We also define
the following terms: $r_{\max} = \max_{l \in \mathcal{L}, c \in \mathcal{C}} r_l^c$, and $r_{\min} = \min_{l \in \mathcal{L}, c \in \mathcal{C}} r_l^c$. When two conflicting
links are scheduled simultaneously on the same channel, both achieve rate 0.


•  $\beta_s$ denotes the self-skew-ratio, defined as the minimum ratio between rates supportable
over different channels on a single link. Therefore, for any two channels $c$ and $d$, and
any link $l$, we have $\frac{r_l^c}{r_l^d} \ge \beta_s$. Note that $0 < \beta_s \le 1$.

•  $\beta_c$ denotes the cross-skew-ratio, defined as the minimum ratio between rates supportable
over the same channel on different links. Therefore, for any channel $c$, and any
two links $l$ and $l'$: $\frac{r_l^c}{r_{l'}^c} \ge \beta_c$. Note that $0 < \beta_c \le 1$.

Let $r_l = \max_{c \in \mathcal{C}} r_l^c$. Let $\sigma_s = \min_{l \in \mathcal{L}} \frac{\sum_{c \in \mathcal{C}} r_l^c}{r_l}$. Note that $\sigma_s \ge 1 + \beta_s(|\mathcal{C}| - 1)$. Moreover,
typically $\sigma_s$ will be much larger than this worst-case bound; $\sigma_s$ is largest when $\beta_s = 1$,
in which case $\sigma_s = |\mathcal{C}|$.


•  $b(l)$ and $e(l)$, respectively, denote the nodes at the two endpoints of a link $l$. In
particular, link $l$ is directed from node $b(l)$ to node $e(l)$.

•  $\mathcal{L}(b(l))$ and $\mathcal{L}(e(l))$ denote the sets of links incident on nodes $b(l)$ and $e(l)$, respectively.
Thus, the links in $\mathcal{L}(b(l))$ and $\mathcal{L}(e(l))$ share an endpoint with link $l$. Since we focus on
single-interface nodes, this implies that if link $l$ is scheduled in a certain time slot, no
other link in $\mathcal{L}(b(l))$ or $\mathcal{L}(e(l))$ can be scheduled at the same time. We refer to this as
an interface conflict. Let $\mathcal{A}(l) = \mathcal{L}(b(l)) \cup \mathcal{L}(e(l))$. Note that $l \in \mathcal{A}(l)$. Links in $\mathcal{A}(l)$
are said to be adjacent to link $l$. Links that have an interface conflict with link $l$ are
those that belong to $\left(\mathcal{L}(b(l)) \cup \mathcal{L}(e(l))\right) \setminus \{l\}$. Let $\mathcal{A}_{\max} = \max_{l \in \mathcal{L}} |\mathcal{A}(l)|$.

•  $I(l)$ denotes the set of links that conflict with link $l$ when scheduled on the same
channel. $I(l)$ may include links that also have an interface-conflict with link $l$. By
convention, $l$ is considered included in $I(l)$. The subset of $I(l)$ comprising interfering
links that are not adjacent to $l$ is denoted by $I'(l)$, i.e., $I'(l) = I(l) \setminus \mathcal{A}(l)$. Let
$I_{\max} = \max_{l \in \mathcal{L}} |I'(l)|$.

•  $K_l$ denotes the maximum number of non-adjacent links in $I'(l)$ that can be scheduled
on a given channel simultaneously if $l$ is not scheduled on that channel. $K_l(|\mathcal{C}|)$
denotes the maximum number of non-adjacent links in $I'(l)$ that can be scheduled
simultaneously on any of the $|\mathcal{C}|$ channels (without conflicts) if $l$ is not scheduled for
transmission. Note that here we exclude links that have an interface conflict with $l$.

¹ Though we assume that $r_l^c > 0$ for all $l, c$, the results can be generalized very easily to handle the case
where $r_l^c = 0$ for some link-channel pairs.

•  $K$ is the largest value of $K_l$ over all links $l$, i.e., $K = \max_{l} K_l$. $K_{|\mathcal{C}|}$ is the largest value
of $K_l(|\mathcal{C}|)$ over all links $l$, i.e., $K_{|\mathcal{C}|} = \max_{l} K_l(|\mathcal{C}|)$. Let $I_{\max} = \max_{l} |I'(l)|$. It is not
hard to see that for single-interface nodes:

$$K \le K_{|\mathcal{C}|} \le \min\{K|\mathcal{C}|, I_{\max}\} \qquad (5.1)$$

We remark that the term $K$ as used by us is similar to, but not exactly the same as,
the term $K$ used in [74]. In [74], $K$ denotes the largest number of links that may be
scheduled simultaneously if some link $l$ is not scheduled, including links adjacent to $l$.
We exclude the adjacent links in our definition of $K$. Throughout this text, we will
refer to the quantity defined in [74] as $\kappa$ instead of $K$.

•  Let $\gamma_l$ be 0 if there are no other links adjacent to $l$ at either endpoint of $l$, 1 if there
are other adjacent links at only one endpoint, and 2 if there are other adjacent links
at both endpoints.

•  $\gamma$ is the largest value of $\gamma_l$ over all links $l$, i.e., $\gamma = \max_{l} \gamma_l$.

•  Load vector: We consider single-hop traffic, i.e., any traffic that originates at a node is
destined for a next-hop node, and is transmitted over the link between the two nodes.
Under this assumption, all the traffic that must traverse a given link can be treated
as a single flow.

The traffic arrival process for link $l$ is denoted by $\{\lambda_l(t)\}$. The arrivals in each slot $t$
are assumed i.i.d. with average $\lambda_l$. The average load on the network is denoted by
load vector $\vec{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_{|\mathcal{L}|}]$, where $\lambda_l$ denotes the arrival rate for the flow on link
$l$. $\lambda_l$ may possibly be 0 for some links $l$.

•  Queues: The packets generated by each flow are first added to a queue maintained at
the source node (depending on the algorithm, there could be a single queue for each
link, or a queue for each (link, channel) pair).

•  Stability: The system of queues in the network is said to be stable if, for all queues $Q$
in the network, the following is true:

$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=1}^{t} E[q(\tau)] < \infty \qquad (5.2)$$

where $q(\tau)$ denotes the backlog in queue $Q$ at time $\tau$.

•  Feasible  load  vector:  In  each  time  slot,  the  scheduler  used  in  the  network  determines 
which  links  should  transmit  and  on  which  channel  (recall  that  each  link  is  a  directed 
link,  with  a  transmitter  and  a  receiver).  In  different  time  slots,  the  scheduler  may 
schedule  a  different  set  of  links  for  transmission.  A  load  vector  is  said  to  be  feasible,  if 
there  exists  a  scheduler  that  can  schedule  transmissions  to  achieve  stability  (as  defined 
above),  when  using  that  load  vector. 

•  Link rate vector: Depending on the schedule chosen in a given slot by the scheduler, each link $l$ will have a certain transmission rate. For instance, using our notation above, if link $l$ is scheduled to transmit on channel $c$, it will have rate $r_l^c$ (we assume that, if the scheduler schedules link $l$ on channel $c$, it does not schedule another conflicting link on that channel). Thus, the schedule chosen for a time-slot yields a link rate vector for that time slot. Note that the link rate vector specifies the rate of transmission used on each link in a certain time slot. On the other hand, the load vector specifies the rate at which traffic is generated for each link.

•  Feasible rate region: The set of all feasible load vectors constitutes the feasible rate-region of the network, and is denoted by $\Lambda$. A throughput-optimal scheduler is one that is capable of maintaining stable queues for any load vector $\vec{\lambda} \in \Lambda$.

•  Throughput-optimal scheduler: From the work of [110], it is known that a scheduler that maintains a queue for each link $l$, and then chooses the schedule given by $\arg\max_{\vec{r}} \sum_l q_l r_l$, is throughput-optimal for scenarios with single-hop traffic ($q_l$ is the backlog in link $l$'s queue, and the maximum is taken over all possible link rate vectors $\vec{r}$). Note that $q_l$ is a function of time, and queue-backlogs at the start of a time slot are used above for computing the schedule (or link-rate vector) for that slot.

•  Imperfect scheduler: It is usually difficult to determine the throughput-optimal link-rate allocations, since the problem is typically computationally intractable. Hence, there has been significant recent interest in imperfect scheduling policies that can be implemented efficiently. In [75], cross-layer rate-control was studied for an imperfect scheduler that chooses (in each time slot) a link-rate vector $\vec{s}$ such that

\[ \sum_l q_l s_l \ge \delta \max_{\vec{r}} \sum_l q_l r_l \]

for some constant $\delta$ ($0 < \delta \le 1$).

It was shown [75] that any scheduler with this property can stabilize any load-vector $\vec{\lambda} \in \delta\Lambda$. Note that if a rate vector $\vec{\lambda}$ is in $\Lambda$, then the rate vector $\delta\vec{\lambda}$ is in $\delta\Lambda$. $\delta\Lambda$ is also referred to as the $\delta$-reduced rate-region. If a scheduler can stabilize all $\vec{\lambda} \in \delta\Lambda$, its efficiency-ratio is said to be $\delta$.

•  Maximal  scheduler.  Under  our  assumed  interference  model,  a  schedule  is  said  to  be 
maximal  if  (a)  no  two  links  in  the  schedule  conflict  with  each  other,  and  (b)  it  is  not 
possible  to  add  any  link  to  the  schedule  without  creating  a  conflict  (either  conflict 
due  to  interference,  or  an  interface-conflict). 
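The definitions above can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions: the helper names (`feasible`, `max_weight_schedule`, `is_maximal`) and the three-link star example are hypothetical, conflicts are encoded as sets of unordered link pairs, and exhaustive enumeration is used purely for clarity (it is exponential in the number of links).

```python
from itertools import product

def feasible(assign, interferes, adjacent):
    """True if no two active links conflict.

    assign maps each link to a channel, or to None if the link is idle.
    adjacent: frozenset pairs sharing an interface (never concurrent).
    interferes: frozenset pairs that conflict only on a common channel.
    """
    active = [l for l, c in assign.items() if c is not None]
    for i, a in enumerate(active):
        for b in active[i + 1:]:
            pair = frozenset((a, b))
            if pair in adjacent:
                return False
            if pair in interferes and assign[a] == assign[b]:
                return False
    return True

def weight(assign, q, r):
    """Sum of q_l * r_l^c over the scheduled (link, channel) pairs."""
    return sum(q[l] * r[l][c] for l, c in assign.items() if c is not None)

def max_weight_schedule(links, channels, interferes, adjacent, q, r):
    """Exhaustive search for the schedule maximizing sum_l q_l r_l."""
    best = None
    for choice in product([None, *channels], repeat=len(links)):
        assign = dict(zip(links, choice))
        if feasible(assign, interferes, adjacent):
            if best is None or weight(assign, q, r) > weight(best, q, r):
                best = assign
    return best

def is_maximal(assign, channels, interferes, adjacent):
    """Conditions (a) and (b): conflict-free, and no idle link can be added."""
    if not feasible(assign, interferes, adjacent):
        return False
    for l, c in assign.items():
        if c is None:
            for ch in channels:
                if feasible({**assign, l: ch}, interferes, adjacent):
                    return False
    return True

# Hypothetical example: center link "m" interferes with radial links "a", "b".
links = ["m", "a", "b"]
channels = [0]
interferes = {frozenset(("m", "a")), frozenset(("m", "b"))}
adjacent = set()
q = {"m": 1, "a": 1, "b": 1}
r = {l: {0: 1.0} for l in links}

best = max_weight_schedule(links, channels, interferes, adjacent, q, r)
center_only = {"m": 0, "a": None, "b": None}
ratio = weight(center_only, q, r) / weight(best, q, r)
```

In this example the schedule containing only the center link is maximal, yet it carries half the weight of the max-weight schedule, which is the kind of gap the efficiency-ratio quantifies.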

We  utilize  the  following  stability  criterion  (from  [85])  based  on  Lyapunov  drift: 

Let $U(t) = (U_i^{(\alpha)}(t))$ be the backlog matrix, where $U_i^{(\alpha)}(t)$ is the backlog in queue $i$ for commodity $\alpha$. Let $L(U)$ be a non-negative function of $U$.

Lemma 42. (Lyapunov Stability) [85] If the Lyapunov function of unfinished work $L(U)$ satisfies:

\[ E\big[L(U(t+1)) - L(U(t)) \mid U(t)\big] \le B - \sum_{i,\alpha} \theta_i^{(\alpha)} U_i^{(\alpha)}(t) \]

for some positive constants $B$, $\{\theta_i^{(\alpha)}\}$, then:

\[ \limsup_{M \to \infty} \frac{1}{M} \sum_{k=0}^{M-1} \left( \sum_{i,\alpha} \theta_i^{(\alpha)} E\big[U_i^{(\alpha)}(kT)\big] \right) \le B \tag{5.3} \]

Furthermore, if there is a nonzero probability that the system will eventually empty, then a steady state distribution for unfinished work exists, with bounded average occupancies $\bar{U}_i^{(\alpha)}$ satisfying

\[ \sum_{i,\alpha} \theta_i^{(\alpha)} \bar{U}_i^{(\alpha)} \le B \tag{5.4} \]

We remark that, though the definition of stability used in [85] is different from the definition we use (our assumed definition conforms to Strong Stability [37]), the proof of Lemma 42 in [85] establishes stability in the sense of the alternative definition by establishing the condition (5.3), which is equivalent to Strong Stability. Therefore, Lemma 42 can be used for the purpose of our results.


5.3  Scheduling  in  Multi-channel  Wireless  Networks 

As was discussed previously, throughput-optimal scheduling is often an intractable problem even in a single-channel network. However, imperfect schedulers that achieve a fraction of the stability-region can potentially be implemented in a reasonably efficient manner. Of particular interest is the class of imperfect schedulers known as maximal schedulers, which we defined in Section 5.2. The performance of maximal schedulers under various assumptions has been studied in much recent work, e.g., [120, 103], with the focus largely on single-channel wireless networks. The issue of designing a distributed scheduler that approximates a maximal scheduler has been addressed in [51], etc.

When there are multiple channels, but each node has one or few interfaces, an additional degree of complexity is added, in terms of channel selection. In particular, when the link-channel rates $r_l^c$ can be different for different links $l$ and channels $c$, the scheduling complexity is exacerbated by the fact that it is not enough to assign different channels to interfering links; for good performance, the channels must be assigned taking achievable rates into account, i.e., individual channel identities are important.

Scheduling in multi-channel multi-radio networks has been examined in [74]. There, it was argued that if a simple maximal scheduler is used in such a network, there could possibly be an arbitrary degradation in efficiency-ratio (assuming arbitrary variability in rates) compared to the efficiency-ratio of a maximal scheduler with identical channels. A queue-loading algorithm was proposed, in conjunction with which a maximal scheduler can stabilize any vector in $\frac{1}{\kappa+2}\Lambda$, for arbitrary $\beta_c$ and $\beta_s$ values. This rule requires knowledge of the length of queues at all interfering links.

Figure 5.1: 2-D visualization of channel heterogeneity

Variability  in  channel  gains  over  different  links  is  very  much  a  characteristic  of  real-world 
wireless  networks,  and  must  indeed  be  handled  by  protocols  and  algorithms.  However,  if  the 
solutions  require  extensive  information-exchange,  the  resultant  performance  improvement 
may  be  offset  by  the  increased  overhead.  In  light  of  this,  it  is  crucial  to  consider  various 
points of trade-off between information and performance. In this context, the quantities $\beta_s$, $\beta_c$ and $\sigma_s$ defined in Section 5.2 prove to be useful. The quantities $\beta_s$ and $\beta_c$ can be viewed as two orthogonal axes for worst-case channel heterogeneity (Fig. 5.1). The quantity $\sigma_s$ provides an aggregate (and thus averaged-out) view of heterogeneity along the $\beta_s$ axis. $\beta_s = 1$ corresponds to a scenario where all channels have identical characteristics, such as bandwidth, modulation/transmission-rate, noise-levels, etc., and the link-gain is a function solely of the separation between sender and receiver. $\beta_c = 1$ corresponds to a scenario where all links have the same sender-receiver separation, and the same conditions/characteristics for any given channel, but the channels may have different characteristics, e.g., an 802.11b channel with a maximum supported data-rate of 11 Mbps, and an 802.11a channel with a maximum supported data-rate of 54 Mbps.
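To make the three heterogeneity quantities tangible, the following minimal sketch computes them from a table of link-channel rates. The formulas are assumptions, since Section 5.2 is not reproduced here: $\beta_s$ is taken as the worst ratio between two channel rates of the same link, $\beta_c$ as the worst ratio between two links' rates on the same channel, and $\sigma_s$ as the smallest value over links of (sum of the link's rates) / (the link's best rate). The function names and the rate table are hypothetical.

```python
def beta_s(r):
    """Assumed definition: min over links of (worst rate / best rate) of that link."""
    return min(min(rates.values()) / max(rates.values()) for rates in r.values())

def beta_c(r):
    """Assumed definition: min over channels of (worst link rate / best link rate)."""
    channels = next(iter(r.values())).keys()
    return min(
        min(r[l][c] for l in r) / max(r[l][c] for l in r)
        for c in channels
    )

def sigma_s(r):
    """Assumed definition: min over links of sum_c r_l^c / max_c r_l^c."""
    return min(sum(rates.values()) / max(rates.values()) for rates in r.values())

# Hypothetical rate table: every link sees rates 1, 1 and 0.5 on channels a, b, c.
rates = {l: {"a": 1.0, "b": 1.0, "c": 0.5} for l in ("l1", "l2", "l3")}

bs, bc, ss = beta_s(rates), beta_c(rates), sigma_s(rates)
```

With identical rates across links, `bc` evaluates to 1 while `bs` drops to 0.5; `ss` stays at 2.5 because it aggregates over all channels rather than looking only at the worst one.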

In this chapter, we show that in a single-interface network, a simple maximal scheduler augmented with local traffic-distribution and threshold rules achieves an efficiency-ratio of at least $\frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|}$. The noteworthy features of this result are:

1. This scheduler does not require information about queues at interfering links.

2. The performance degradation (compared to the scheduler of [74]) when rates are variable, i.e., $\beta_s, \beta_c \neq 1$, is not arbitrary, and is at worst $\frac{\sigma_s}{|C|} \ge \frac{1}{|C|}$. Thus, even with a purely local information based queue-loading rule, it is possible to avoid arbitrary performance degradation even in the worst case. Typically, the performance would be much better.

3. In many network scenarios, the provable lower bound of $\frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|}$ may actually be better than $\frac{1}{\kappa+2}$. This is particularly likely to happen in networks with single-interface nodes, e.g., suppose we have three channels $a, b, c$ with $r_l^a = 1$, $r_l^b = 1$, $r_l^c = 0.5$ for all links $l$. Then, in the network in Fig. 5.2 (where the link-interference graph is a star with $x$ radial vertices, and there are no interface-conflicts), $K_{|C|} = \kappa = x$, $\gamma = 0$, $\sigma_s = 2.5$, and we obtain a bound of $\frac{2.5}{x+3}$, whereas the proved lower bound of the scheduler of [74] is $\frac{1}{x+2}$.

Figure 5.2: Example of improved bound on efficiency ratio: link-interference topology is a star with a center link and $x$ radial links. (Legend: each vertex represents a link; each edge represents a channel interference conflict.)
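As a quick numerical illustration of the star example, the sketch below evaluates the chapter's lower bound $\frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|}$ for a few star sizes. The function name `local_rule_bound` is hypothetical, and the value $K_{|C|} = x$ for a star with $x$ radial links is an assumption read off the topology; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def local_rule_bound(sigma_s, k_all, gamma, num_channels):
    """Lower bound sigma_s / (K_|C| + max{1, gamma} * |C|) on the efficiency-ratio."""
    return Fraction(sigma_s) / (k_all + max(1, gamma) * num_channels)

# Star example: three channels with rates 1, 1, 0.5, so sigma_s = 2.5 and |C| = 3.
# No interface-conflicts, hence gamma = 0; K_|C| = x is assumed for x radial links.
table = {x: local_rule_bound(Fraction(5, 2), x, 0, 3) for x in (1, 2, 5, 10)}

for x, bound in table.items():
    print(f"x = {x:2d}: bound = {bound} ~ {float(bound):.3f}")
```

The bound decays like $1/x$ as the star grows, but the numerator $\sigma_s = 2.5$ reflects that the two full-rate channels and the half-rate channel are all credited.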

The multi-channel scheduling problem is further complicated if the rates $r_l^c$ are time-varying, i.e., $r_l^c = r_l^c(t)$. However, handling such time-varying rates is beyond the scope of the results in this chapter, and we address only the case where rates do not exhibit time-variation.

5.4  Summary  of  Results


For  multi-channel  wireless  networks  with  single-interface  nodes,  we  present  lower  bounds 
on  the  efficiency-ratio  of  a  class  of  maximal  schedulers  (including  both  centralized  and 
distributed  schedulers),  which  indicate  that  the  worst-case  efficiency-ratio  can  be  higher 
when  there  are  multiple  channels  (as  compared  to  the  single-channel  case).  More  specifically, 
we  show  that: 


•  The number of links scheduled by any maximal scheduler is at least a $\delta$ fraction of the maximum number of links activated by any feasible schedule, where:

\[ \delta = \max\left\{ \frac{|C|}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\} \]

•  A centralized greedy maximal (CGM) scheduler achieves an efficiency-ratio of at least $\max\left\{ \frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\}$. This constitutes an improvement over the lower bound for the CGM scheduler proved in [74]. Since $K_{|C|} \le \min\{K|C|, I_{max}\}$, this new bound on efficiency-ratio can often be substantially tighter.

•  We show that any maximal scheduler, in conjunction with a simple local queue-loading rule, and a threshold-based link-participation rule, achieves an efficiency-ratio of at least $\frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|}$. This scheduler is of significant interest as it does not require information about queues at all interfering links.

Note  that  the  text  below  makes  the  natural  assumption  that  two  links  that  conflict  with 
each  other  (due  to  interference  or  interface-conflict)  are  not  scheduled  in  the  same  timeslot 
by  any  scheduler  discussed  in  the  rest  of  this  chapter. 


5.5  Maximal  Schedulers 

We  begin  by  proving  a  result  about  the  cardinality  of  the  set  of  links  scheduled  by  any 
maximal  scheduler. 

Theorem 10. Let $S_{opt}$ denote the set of links scheduled by a scheduler that seeks to maximize the number of links scheduled for transmission, and let $S_{max}$ denote the set of links activated by any maximal scheduler. Then the following is true:

\[ \frac{|S_{max}|}{|S_{opt}|} \ge \max\left\{ \frac{|C|}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\} \tag{5.5} \]


Proof. Denote by $c^m(l')$ the channel on which a link $l'$ is scheduled in $S_{max}$.

Consider $l \in S_{opt} \cap \overline{S_{max}}$. Since $l$ was not scheduled by the maximal scheduler, this implies that at least one of the following conditions must be true:

1. Condition 1: $S_{max} \cap \overline{S_{opt}} \cap A(l) \neq \emptyset$.

2. Condition 2: For each channel $c \in C$, there exists some link $l'_c \in S_{max} \cap I(l)$, such that $c^m(l'_c) = c$.

Now, define sets $A_{if}$ and $A_{in}$ as follows:

$A_{if} = \{l : l \in S_{opt} \cap \overline{S_{max}}$ and Condition 1 holds$\}$
$A_{in} = (S_{opt} \cap \overline{S_{max}}) \setminus A_{if}$

Thus $A_{if}$ comprises the set of links in $S_{opt} \cap \overline{S_{max}}$ that have an interface conflict with some link in the maximal-schedule, while $A_{in}$ comprises the set of links in $S_{opt} \cap \overline{S_{max}}$ that are blocked in the maximal-schedule purely by channel-interference conflicts.

For each $l \in A_{in}$, let $T_l = S_{max} \cap I(l)$. Taking note of Condition 2, each link $l \in A_{in}$ must be blocked on each channel $c \in C$ by at least one link in $T_l$. Any link $h \in S_{max}$ can occur in the $T_l$ of at most $K_{|C|}$ non-adjacent links $l \in S_{opt}$.

Therefore,  it  follows  that: 


\[ |C|\,|A_{in}| \le K_{|C|}\,|S_{max}| \tag{5.6} \]

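Any interface-conflicts experienced by links in $S_{opt} \cap \overline{S_{max}}$ must necessarily be caused by links in $S_{max} \cap \overline{S_{opt}}$. Since a link can only block up to $\gamma$ links through interface-conflicts, we obtain that:

\[ |A_{if}| \le \gamma\,|S_{max} \cap \overline{S_{opt}}| \tag{5.7} \]

Thus we obtain the following:

\begin{align*}
\frac{|S_{opt}|}{|S_{max}|}
&= \frac{|S_{max} \cap S_{opt}| + |S_{opt} \cap \overline{S_{max}}|}{|S_{max}|}
 = \frac{|S_{max} \cap S_{opt}| + |A_{if}| + |A_{in}|}{|S_{max}|} \\
&\le \frac{|S_{max} \cap S_{opt}| + \gamma\,|S_{max} \cap \overline{S_{opt}}| + \frac{K_{|C|}}{|C|}\,|S_{max}|}{|S_{max}|} \qquad \text{from (5.7) \& (5.6)} \\
&= \frac{|S_{max} \cap S_{opt}| + |S_{max} \cap \overline{S_{opt}}| + (\gamma - 1)\,|S_{max} \cap \overline{S_{opt}}| + \frac{K_{|C|}}{|C|}\,|S_{max}|}{|S_{max}|} \\
&= \frac{|S_{max}| + (\gamma - 1)\,|S_{max} \cap \overline{S_{opt}}| + \frac{K_{|C|}}{|C|}\,|S_{max}|}{|S_{max}|} \\
&\le \frac{|S_{max}| + \max\{0, \gamma - 1\}\,|S_{max}| + \frac{K_{|C|}}{|C|}\,|S_{max}|}{|S_{max}|} \\
&= 1 + \max\{0, \gamma - 1\} + \frac{K_{|C|}}{|C|}
 = \max\{1, \gamma\} + \frac{K_{|C|}}{|C|}
 = \frac{K_{|C|} + \max\{1,\gamma\}|C|}{|C|} \tag{5.8}
\end{align*}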

Furthermore, consider any link $l$ in $S_{max}$. Either $l$ is scheduled even in $S_{opt}$, or if $l$ is not scheduled in $S_{opt}$, at most $K$ links in $I(l)$, and $\gamma$ links in $A(l) \setminus \{l\}$ could have been scheduled in $S_{opt}$. Thus:

\[ \frac{|S_{opt}|}{|S_{max}|} \le \max\{1, K + \gamma\} \tag{5.9} \]

Combining (5.8) and (5.9), we obtain that:

\[ |S_{max}| \ge \max\left\{ \frac{|C|}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\} |S_{opt}| \tag{5.10} \]

□ 
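Theorem 10 can be sanity-checked on a tiny instance by exhaustive enumeration. The sketch below is an illustration only: the function names are hypothetical, the instance is a star link-interference graph with 3 radial links, 2 channels and no interface-conflicts, and the parameter values $K = K_{|C|} = 3$, $\gamma = 0$ passed to the bound are assumptions read off that topology rather than computed from the definitions of Section 5.2.

```python
from itertools import product

def feasible(assign, interferes, adjacent):
    """No interface-conflicts, and no interfering pair on a common channel."""
    active = [l for l, c in assign.items() if c is not None]
    for i, a in enumerate(active):
        for b in active[i + 1:]:
            pair = frozenset((a, b))
            if pair in adjacent:
                return False
            if pair in interferes and assign[a] == assign[b]:
                return False
    return True

def maximal(assign, channels, interferes, adjacent):
    """Feasible, and no idle link can be added on any channel."""
    if not feasible(assign, interferes, adjacent):
        return False
    return not any(
        feasible({**assign, l: ch}, interferes, adjacent)
        for l, c in assign.items() if c is None
        for ch in channels
    )

def extreme_cardinalities(links, channels, interferes, adjacent):
    """Return (smallest maximal schedule size, largest feasible schedule size)."""
    all_sizes, maximal_sizes = [], []
    for choice in product([None, *channels], repeat=len(links)):
        assign = dict(zip(links, choice))
        if not feasible(assign, interferes, adjacent):
            continue
        size = sum(c is not None for c in choice)
        all_sizes.append(size)
        if maximal(assign, channels, interferes, adjacent):
            maximal_sizes.append(size)
    return min(maximal_sizes), max(all_sizes)

def theorem10_bound(k_one, k_all, gamma, num_channels):
    """Right-hand side of (5.5)."""
    return max(
        num_channels / (k_all + max(1, gamma) * num_channels),
        1 / max(1, k_one + gamma),
    )

x = 3
links = ["m"] + [f"r{i}" for i in range(x)]
interferes = {frozenset(("m", f"r{i}")) for i in range(x)}
smallest, largest = extreme_cardinalities(links, [0, 1], interferes, set())
bound = theorem10_bound(k_one=x, k_all=x, gamma=0, num_channels=2)
```

On this instance the worst maximal schedule activates 3 links against an optimum of 4, comfortably above the guaranteed fraction of 0.4; the bound is a worst-case guarantee, not a prediction.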


5.6  Centralized  Greedy  Maximal  Scheduler 

A  centralized  greedy  maximal  (CGM)  scheduler  operates  in  the  manner  described  below. 
In  each  timeslot: 

1. Calculate link weights $w_l^c = q_l r_l^c$ for all links $l$ and channels $c$.

2. Sort the link-channel pairs $(l,c)$ in non-increasing order of $w_l^c$.

3. Add the first link-channel pair in the sorted list (i.e., the one with highest weight) to the schedule for the timeslot, and remove from the list all link-channel pairs that are no longer feasible (due to either interface or interference conflicts).

4. Repeat step 3 until the list is exhausted (i.e., no more links can be added to the schedule).
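The four steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `cgm_schedule` and the encoding of conflicts as sets of unordered link pairs are assumptions, single-interface nodes are assumed (a link uses at most one channel), and infeasible pairs are skipped when reached rather than physically removed from the list, which yields the same schedule.

```python
def cgm_schedule(q, r, interferes, adjacent):
    """Centralized greedy maximal schedule for one timeslot.

    q[l]: backlog of link l; r[l][c]: rate of link l on channel c.
    adjacent: frozenset pairs with an interface-conflict.
    interferes: frozenset pairs that conflict when on the same channel.
    Returns a dict mapping each scheduled link to its channel.
    """
    # Steps 1-2: weight every (link, channel) pair and sort, heaviest first.
    pairs = sorted(
        ((q[l] * r[l][c], l, c) for l in r for c in r[l]),
        key=lambda p: -p[0],
    )
    schedule = {}
    # Steps 3-4: walk the sorted list, keeping each pair that is still feasible.
    for _, l, c in pairs:
        if l in schedule:
            continue  # the link's single interface is already in use
        conflict = any(
            frozenset((l, k)) in adjacent
            or (frozenset((l, k)) in interferes and ck == c)
            for k, ck in schedule.items()
        )
        if not conflict:
            schedule[l] = c
    return schedule

# Hypothetical example: center link "m" interferes with radial links "a" and "b".
q = {"m": 3, "a": 2, "b": 2}
r = {l: {0: 1.0, 1: 0.5} for l in q}
interferes = {frozenset(("m", "a")), frozenset(("m", "b"))}

schedule = cgm_schedule(q, r, interferes, adjacent=set())
total = sum(q[l] * r[l][c] for l, c in schedule.items())
```

Here the greedy choice puts the center link on the fast channel and pushes both radial links to the slow one, for a total weight of 5.0, whereas placing the radial links on the fast channel would give 5.5. The gap is the kind of sub-optimality that Theorem 11 bounds.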

In [74], it was shown that this centralized greedy maximal (CGM) scheduler can achieve an approximation-ratio of at least $\frac{1}{\kappa+2}$ in a multi-channel multi-radio network, where $\kappa$ is the maximum number of links conflicting with a link $l$ that may possibly be scheduled concurrently when $l$ is not scheduled. This bound holds for arbitrary values of $\beta_s$ and $\beta_c$, and variable number of interfaces per node.

However,  this  bound  can  be  quite  loose  in  multi-channel  wireless  networks  where  each 
device  has  one  or  few  interfaces. 

In  this  section,  we  prove  an  improved  bound  on  the  efficiency-ratio  achievable  with  the 
CGM  scheduler  for  single-interface  nodes.  We  also  briefly  discuss  how  it  can  be  used  to 
obtain  a  bound  for  multi-interface  nodes. 

Theorem 11. Let $S_{opt}$ denote the set of links activated by an optimal scheduler that chooses a set of link-channel pairs $(l,c)$ for transmission such that $\sum w_l^c$ is maximized. Let $c^*(l)$ denote the channel assigned to link $l \in S_{opt}$ by this optimal scheduler.

Let $S_g$ denote the set of links activated by the centralized greedy maximal (CGM) scheduler, and let $c^g(l)$ denote the channel assigned to a link $l \in S_g$.

Then:

\[ \frac{\sum_{l \in S_g} w_l^{c^g(l)}}{\sum_{l \in S_{opt}} w_l^{c^*(l)}} \ge \max\left\{ \frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\} \tag{5.11} \]


Proof. We denote by $c^*(l)$ the channel on which $l \in S_{opt}$ is activated by the optimal scheduler. $c^g(l)$ is the channel on which $l \in S_g$ is activated by the CGM scheduler. If a link $l$ is not in $S_{opt}$ or $S_g$, then, as a matter of notational convention, it can be said that $c^*(l) = \perp$ or $c^g(l) = \perp$ respectively, where $\perp$ denotes "undefined".

Consider $l \in S_{opt} \cap \overline{S_g}$, i.e., $l$ was not scheduled by the CGM scheduler. This implies that during some step $k$ of the execution of the CGM algorithm, $l$'s status changed from schedulable to unschedulable. This could happen for one of two reasons: (1) in step $k$, some link $l'$ incident on one of $l$'s endpoints was selected by the CGM scheduler, thereby making $l$ unschedulable due to an interface-conflict; (2) in step $k$, all $|C|$ channels became infeasible for $l$ to be scheduled, implying that for all $c \in C$, some link $l' \in I(l)$ was scheduled on $c$ by the scheduler by the end of step $k$.

By the definition of the CGM scheduler, a link $l'$ would be preferentially selected for scheduling over $l$ (while $l$ was still schedulable) only if the resultant weight contribution $w_{l'}^{c^g(l')}$ equals or exceeds the best weight that could be achieved by scheduling $l$ on some still feasible channel. Thus, at least one of the following two conditions must be true:

1. Condition 1: There exists a link $l' \in S_g \cap \overline{S_{opt}} \cap A(l)$ such that $w_{l'}^{c^g(l')} \ge w_l^c$ for at least one channel $c \in C$.

2. Condition 2: For each channel $c \in C$, there exists some link $l'_c \in S_g \cap I(l)$ such that $w_{l'_c}^{c} \ge w_l^c$.

Now, define sets $A_{if}$ and $A_{in}$ as follows:

$A_{if} = \{l : l \in S_{opt} \cap \overline{S_g}$ and Condition 1 holds$\}$.

$A_{in} = (S_{opt} \cap \overline{S_g}) \setminus A_{if}$.

Let $S_{b,m} = \{l : l \in S_g \cap S_{opt},\ w_l^{c^g(l)} \ge w_l^{c^*(l)}\}$.

Let $S_{b,s} = \{l : l \in S_g \cap S_{opt},\ w_l^{c^g(l)} < w_l^{c^*(l)}\}$.

Then $S_{b,m}$ and $S_{b,s}$ constitute a partition of $S_g \cap S_{opt}$.

Define two subsets of $A_{if}$ as follows:

$A_{if,1} = \{l : l \in A_{if},\ c^*(l)$ was not available to $l$ when $l$'s first interface got used up during CGM scheduling$\}$

$A_{if,2} = \{l : l \in A_{if},\ c^*(l)$ was still available to $l$ when $l$'s first interface got used up during CGM scheduling$\}$

From the centralized greedy nature of the scheduler, if a link $l' \in I(l)$ was scheduled on some $c \in C$ in $S_g$ while $l$ was still schedulable on some subset of channels $D \subseteq C$, this implies that $w_{l'}^c \ge w_l^d$ for all $d \in D$.

It is true that at the time when $l \in S_{b,s}$ was assigned $c^g(l)$, all other $c \in C$ with $r_l^c > r_l^{c^g(l)}$ were already assigned to some other $l' \in I(l)$, with $w_{l'}^{c^g(l')} = w_{l'}^c \ge w_l^c$. Therefore, if $D_l$ is the set of channels on which $l$ was still schedulable when $l$ was chosen for scheduling on $c^g(l)$, then: $\forall d \in D_l : r_l^d \le r_l^{c^g(l)}$, and $|D_l| \le |C| - 1$ since $c^*(l) \notin D_l$.

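Therefore for each $l \in S_{b,s}$:

\[ \sum_{c \in C \setminus D_l}\ \sum_{\substack{l' \in I(l) \\ c^g(l') = c}} w_{l'}^{c^g(l')} \ \ge\ \sum_{c \in C} w_l^c - \sum_{d \in D_l} w_l^d \ \ge\ \sum_{c \in C} w_l^c - (|C| - 1)\, w_l^{c^g(l)} \tag{5.12} \]

Let $B_1(l) = \{l' \mid l' \in (S_g \cap I(l)),\ c^g(l') \in C \setminus D_l\}$.

\[ \therefore \sum_{l \in S_{b,s}} \left( \sum_{c \in C \setminus D_l}\ \sum_{\substack{l' \in I(l) \\ c^g(l') = c}} w_{l'}^{c^g(l')} \right) \ \ge\ \sum_{l \in S_{b,s}} \sum_{c \in C} w_l^c - (|C| - 1) \sum_{l \in S_{b,s}} w_l^{c^g(l)} \tag{5.13} \]

\[ \therefore \sum_{l \in S_{b,s}}\ \sum_{l' \in B_1(l)} w_{l'}^{c^g(l')} \ \ge\ \sum_{l \in S_{b,s}} \sum_{c \in C} w_l^c - (|C| - 1) \sum_{l \in S_{b,s}} w_l^{c^g(l)} \]

We now consider links $l \in A_{if}$.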

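Let us denote by $f(l)$ the link $l'$ in $S_g \cap \overline{S_{opt}}$ that is the cause of blocking the first interface of link $l \in A_{if}$, i.e., $f(l)$ is the link that first caused $l$ to experience an interface-conflict. We first consider links $l \in A_{if,1}$.

It is true that if $f(l) = l' \in A(l) \cap (S_g \cap \overline{S_{opt}})$ was assigned a channel $c^g(l')$ in $S_g \cap \overline{S_{opt}}$ while $l \in A_{if,1}$ was still schedulable on some subset of channels $D_l \subseteq C \setminus \{c^*(l)\}$, then $w_{l'}^{c^g(l')} \ge w_l^d$ for all $d \in D_l$, and $|D_l| \le |C| - 1$ since $c^*(l) \notin D_l$ (note that $c^*(l) \notin D_l$ by the definition of $A_{if,1}$).

Let $B = \sum_{l \in A_{if,1}} w_{f(l)}^{c^g(f(l))}$.

Furthermore, at least one link $l' \in I(l)$ was scheduled on each $c \in C \setminus D_l$, and for each such $c$, it is evident that $w_{l'}^{c^g(l')} = w_{l'}^c \ge w_l^c$ (since channels in $C \setminus D_l$ were no longer feasible for $l$ at the time its first interface got used up). This yields:

\[ \sum_{c \in C \setminus D_l}\ \sum_{\substack{l' \in I(l) \\ c^g(l') = c}} w_{l'}^{c^g(l')} \ \ge\ \sum_{c \in C} w_l^c - \sum_{d \in D_l} w_l^d \ \ge\ \sum_{c \in C} w_l^c - (|C| - 1)\, w_{f(l)}^{c^g(f(l))} \tag{5.14} \]

Resultantly:

\[ \sum_{l \in A_{if,1}} \left( \sum_{c \in C \setminus D_l}\ \sum_{\substack{l' \in I(l) \\ c^g(l') = c}} w_{l'}^{c^g(l')} \right) \ \ge\ \sum_{l \in A_{if,1}} \sum_{c \in C} w_l^c - (|C| - 1) B \tag{5.15} \]

Let $B_2(l) = \{l' \mid l' \in (S_g \cap I(l)),\ c^g(l') \in C \setminus D_l\}$.

\[ \sum_{l \in A_{if,1}} \left( \sum_{l' \in B_2(l)} w_{l'}^{c^g(l')} \right) \ \ge\ \sum_{l \in A_{if,1}} \sum_{c \in C} w_l^c - (|C| - 1) B \tag{5.16} \]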

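We next consider links $l \in A_{if,2}$.

From the definition of $A_{if,2}$, for each link $l \in A_{if,2}$, some link $f(l) = l'$ adjacent to $l$ was scheduled in $S_g \cap \overline{S_{opt}}$ at a time when $l$ was still schedulable on $c^*(l)$. This implies that $w_{l'}^{c^g(l')} \ge w_l^{c^*(l)}$. Let $E = \sum_{l \in A_{if,2}} w_{f(l)}^{c^g(f(l))}$ (recall the definition of $f(l)$ for links $l \in A_{if}$).

Thus we obtain:

\[ B + \sum_{l \in A_{if,2}} w_l^{c^*(l)} \ \le\ B + E \ \le\ \gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} \]

\[ \therefore \sum_{l \in A_{if,2}} w_l^{c^*(l)} \ \le\ \gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} - B \tag{5.17} \]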


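We now consider links $l \in A_{in}$.

From the definition of $A_{in}$, it follows that for each $c \in C$, there is at least one $l' \in I(l)$ scheduled on $c$ such that $w_{l'}^{c^g(l')} = w_{l'}^c \ge w_l^c$. Given $l \in A_{in}$, let $B_3(l) = I(l)$. Then:

\[ \sum_{l \in A_{in}} \left( \sum_{l' \in B_3(l)} w_{l'}^{c^g(l')} \right) \ \ge\ \sum_{l \in A_{in}} \sum_{c \in C} w_l^c \tag{5.18} \]

Also note that for any link $l' \in S_g$, at most $K_{|C|}$ links in $I(l')$ can be scheduled in $S_{opt}$. Thus, any link $l' \in S_g$ figures in $B_1(l)$ or $B_2(l)$ or $B_3(l)$ of at most $K_{|C|}$ links $l \in S_{opt}$. In light of this observation, the definition of $\sigma_s$, and using (5.13), (5.16) and (5.18):

\[ \sum_{l \in S_{b,s}} \sum_{c \in C} w_l^c - (|C| - 1) \sum_{l \in S_{b,s}} w_l^{c^g(l)} + \sum_{l \in A_{if,1}} \sum_{c \in C} w_l^c - (|C| - 1) B + \sum_{l \in A_{in}} \sum_{c \in C} w_l^c \ \le\ K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} \tag{5.19} \]

Rearranging and noting that $\sum_{c \in C} w_l^c \ge \sigma_s w_l^{c^*(l)}$:

\[ \sigma_s \left( \sum_{l \in S_{b,s}} w_l^{c^*(l)} + \sum_{l \in A_{if,1}} w_l^{c^*(l)} + \sum_{l \in A_{in}} w_l^{c^*(l)} \right) \ \le\ K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \left( \sum_{l \in S_{b,s}} w_l^{c^g(l)} + B \right) \]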

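This yields the following:

\[ \sum_{l \in S_{b,s}} w_l^{c^*(l)} + \sum_{l \in A_{if,1}} w_l^{c^*(l)} + \sum_{l \in A_{in}} w_l^{c^*(l)} \ \le\ \frac{1}{\sigma_s} \left( K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \left( \sum_{l \in S_{b,s}} w_l^{c^g(l)} + B \right) \right) \tag{5.20} \]

Since $S_{b,m}$, $S_{b,s}$, $A_{if,1}$, $A_{if,2}$ and $A_{in}$ partition $S_{opt}$, and $w_l^{c^*(l)} \le w_l^{c^g(l)}$ for $l \in S_{b,m}$:

\begin{align*}
\frac{\sum_{l \in S_{opt}} w_l^{c^*(l)}}{\sum_{l \in S_g} w_l^{c^g(l)}}
&= \frac{\sum_{l \in S_g \cap S_{opt}} w_l^{c^*(l)} + \sum_{l \in S_{opt} \cap \overline{S_g}} w_l^{c^*(l)}}{\sum_{l \in S_g} w_l^{c^g(l)}} \\
&= \frac{\sum_{l \in S_{b,m}} w_l^{c^*(l)} + \sum_{l \in S_{b,s}} w_l^{c^*(l)} + \sum_{l \in A_{if,1}} w_l^{c^*(l)} + \sum_{l \in A_{if,2}} w_l^{c^*(l)} + \sum_{l \in A_{in}} w_l^{c^*(l)}}{\sum_{l \in S_g} w_l^{c^g(l)}} \\
&\le \frac{\sum_{l \in S_{b,m}} w_l^{c^g(l)} + \frac{1}{\sigma_s} \Big( K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \big( \sum_{l \in S_{b,s}} w_l^{c^g(l)} + B \big) \Big) + \gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} - B}{\sum_{l \in S_g} w_l^{c^g(l)}} \qquad \text{from (5.20), (5.17)}
\end{align*}

We now use $\sum_{l \in S_{b,s}} w_l^{c^g(l)} = \sum_{l \in S_g} w_l^{c^g(l)} - \sum_{l \in S_{b,m}} w_l^{c^g(l)} - \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)}$ and $\sigma_s \le |C|$, noting that $\gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} - B \ge 0$:

\begin{align*}
\frac{\sum_{l \in S_{opt}} w_l^{c^*(l)}}{\sum_{l \in S_g} w_l^{c^g(l)}}
&\le \frac{|C| \sum_{l \in S_{b,m}} w_l^{c^g(l)} + |C| \Big( \gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} - B \Big) + K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \Big( \sum_{l \in S_g} w_l^{c^g(l)} - \sum_{l \in S_{b,m}} w_l^{c^g(l)} - \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} + B \Big)}{\sigma_s \sum_{l \in S_g} w_l^{c^g(l)}} \\
&= \frac{K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \Big( \sum_{l \in S_g} w_l^{c^g(l)} + (\gamma - 1) \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} \Big) + \sum_{l \in S_{b,m}} w_l^{c^g(l)} + \gamma \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} - B}{\sigma_s \sum_{l \in S_g} w_l^{c^g(l)}} \\
&\le \frac{K_{|C|} \sum_{l \in S_g} w_l^{c^g(l)} + (|C| - 1) \Big( \sum_{l \in S_g} w_l^{c^g(l)} + (\gamma - 1) \sum_{l \in S_g \cap \overline{S_{opt}}} w_l^{c^g(l)} \Big) + \max\{1, \gamma\} \sum_{l \in S_g} w_l^{c^g(l)}}{\sigma_s \sum_{l \in S_g} w_l^{c^g(l)}} \tag{5.21} \\
&\le \frac{K_{|C|} + (|C| - 1)(1 + \max\{0, \gamma - 1\}) + \max\{1, \gamma\}}{\sigma_s}
 = \frac{K_{|C|} + \max\{1,\gamma\}|C|}{\sigma_s} \tag{5.22}
\end{align*}

Thus $\frac{\sum_{l \in S_g} w_l^{c^g(l)}}{\sum_{l \in S_{opt}} w_l^{c^*(l)}} \ge \frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|}$. If $\beta_s = 1$, this reduces to a ratio of $\frac{|C|}{K_{|C|} + \max\{1,\gamma\}|C|}$.

We now prove another bound by showing that:

\[ \frac{\sum_{l \in S_g} w_l^{c^g(l)}}{\sum_{l \in S_{opt}} w_l^{c^*(l)}} \ge \frac{1}{\max\{1, K + \gamma\}} \tag{5.23} \]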


This is obtained via an argument very similar to that used in [74] to prove a bound of $\frac{1}{\kappa+2}$ for the CGM scheduler, except that we refine the analysis based on a more precise characterization of the interference topology:

Consider any link $l$ in $S_{opt}$. Either $l$ is scheduled on $c^*(l)$ even in $S_g$, or if $l$ is not scheduled on $c^*(l)$, then either (1) some link $l' \in I(l)$ must be scheduled on $c^*(l)$ in $S_g$ (i.e., $c^g(l') = c^*(l)$) such that $w_{l'}^{c^g(l')} \ge w_l^{c^*(l)}$, or (2) some link $l' \in A(l) \setminus \{l\}$ must be scheduled on some channel $c^g(l')$ such that $w_{l'}^{c^g(l')} \ge w_l^{c^*(l)}$. However, any link $l' \in S_g$ can only have pure interference conflict with at most $K$ links that were scheduled in $S_{opt}$ on that channel, and interface conflict with at most $\gamma$ links in $A(l') \cap S_{opt}$. Thus:

\[ \frac{\sum_{l \in S_{opt}} w_l^{c^*(l)}}{\sum_{l \in S_g} w_l^{c^g(l)}} \le \max\{1, K + \gamma\} \tag{5.24} \]


Combining  (5.21)  and  (5.24)  yields  the  result. 


Theorem  11  leads  to  the  following  result: 


Theorem 12. The centralized greedy maximal (CGM) scheduler can stabilize the $\delta$-reduced rate-region, where:

\[ \delta = \max\left\{ \frac{\sigma_s}{K_{|C|} + \max\{1,\gamma\}|C|},\ \frac{1}{\max\{1, K+\gamma\}} \right\} \]

Proof. We earlier discussed a result from [75] that any scheduler which chooses a rate-allocation $\vec{s}$ such that $\sum_l q_l s_l \ge \delta \max_{\vec{r}} \sum_l q_l r_l$ can stabilize the $\delta$-reduced rate-region. Using Theorem 11 and this result, we obtain the above result. □


We remark that the above bound is independent of $\beta_c$.

5.6.1  Extension  to  Multiple  Interfaces  per  Node

We  now  describe  how  the  result  can  be  extended  to  networks  where  each  node  may  have 
more  than  one  interface. 

Given the original network node-graph $G = (V,E)$, construct the following transformed graph $G' = (V',E')$:

For each node $v \in V$, if $v$ has $m_v$ interfaces, create nodes $v_1, v_2, \ldots, v_{m_v}$ in $V'$.

For each edge $(u,v) \in E$, where $u, v$ have $m_u, m_v$ interfaces respectively, create edges $(u_i, v_j)$, $1 \le i \le m_u$, $1 \le j \le m_v$, and set $q_{(u_i,v_j)} = q_{(u,v)}$. Set the achievable channel rate appropriately for each edge in $E'$ and each channel. For example, assuming that the channel-rate is solely a function of $u$, $v$ and $c$, then: for each channel $c$, set $r^c_{(u_i,v_j)} = r^c_{(u,v)}$.

The transformed graph $G'$ comprises only single-interface links, and thus Theorem 11 applies to it. Moreover, it is not hard to see that a schedule that maximizes $\sum q_l r_l$ in $G'$ also maximizes $\sum q_l r_l$ in $G$. Thus, the efficiency-ratio from Theorem 11 for network graph $G'$ yields an efficiency-ratio for the performance of the CGM scheduler in the multi-interface network.
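The construction of $G'$ can be sketched in a few lines. This is a minimal illustration under the assumption stated above that the channel rate depends only on the endpoints and the channel; the function name `expand_interfaces` and the representation of an interface as a `(node, index)` tuple are hypothetical.

```python
def expand_interfaces(edges, num_ifaces, q, r):
    """Build the single-interface graph G' from a multi-interface graph G.

    edges: list of (u, v) links of G.
    num_ifaces[v]: number of interfaces m_v of node v.
    q[(u, v)]: backlog of link (u, v); r[(u, v)][c]: its rate on channel c.
    Returns (nodes of G', backlog per edge of G', rates per edge of G').
    """
    # One node of G' per interface of every node of G.
    nodes = [(v, i) for v, m in num_ifaces.items() for i in range(1, m + 1)]
    new_q, new_r = {}, {}
    for (u, v) in edges:
        for i in range(1, num_ifaces[u] + 1):
            for j in range(1, num_ifaces[v] + 1):
                e = ((u, i), (v, j))
                new_q[e] = q[(u, v)]        # every copy inherits the backlog
                new_r[e] = dict(r[(u, v)])  # and the per-channel rates
    return nodes, new_q, new_r

# Hypothetical example: u has 2 interfaces, v has 3.
nodes, new_q, new_r = expand_interfaces(
    edges=[("u", "v")],
    num_ifaces={"u": 2, "v": 3},
    q={("u", "v"): 7},
    r={("u", "v"): {0: 1.0, 1: 0.5}},
)
```

A single edge between a 2-interface node and a 3-interface node expands into 6 edges over 5 nodes, which hints at why the number of non-adjacent interfering links can grow in $G'$.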

We briefly touch upon how one would expect the ratio to vary as the number of interfaces at each node increases. Note that the efficiency-ratio depends on $\beta_s$, $K_{|C|}$, $\gamma$ and $|C|$. Of these, $\beta_s$ and $|C|$ are always the same for both $G$ and $G'$. $\gamma$ is also always the same for any $G'$ derived from a given node-graph $G$, as it depends only on the number of other node-links incident on either endpoint of a node-link in $G$ (which is a property of the node topology, and not the number of interfaces each node has). However, $K_{|C|}$ might potentially increase in $G'$ as there are many more non-adjacent interfering links when each interface is viewed as a distinct node. Thus, for a given number of channels $|C|$, one would expect the provable efficiency-ratio to initially decrease as we add more interfaces, and then become static.

While  this  may  initially  seem  counter-intuitive,  this  is  explained  by  the  observation 
that  multiple  orthogonal  channels  yielded  a  better  efficiency-ratio  in  the  single-interface 
case  since  there  was  more  spectral  resource,  but  limited  hardware  (interfaces)  to  utilize 
it.  Thus,  the  additional  channels  could  be  effectively  used  to  alleviate  the  impact  of  sub- 
optimal  scheduling.  When  the  hardware  is  commensurate  with  the  number  of  channels,  the 
situation  (compared  to  an  optimal  scheduler)  increasingly  starts  to  resemble  a  single-channel 
single-interface  network. 




5.6.2  The  Special  Case  of  |C|  Interfaces  per  Node

Let us consider the special case where each node in the network has $|C|$ interfaces, and the achievable rate on a link between nodes $u, v$ on any channel $c \in C$ is solely a function of $u$, $v$ and $c$ (and not of the interfaces used). In this case, it is possible to obtain a simpler transformation. Given the original network node-graph $G = (V,E)$, construct $|C|$ copies of this graph, viz., $G_1, G_2, \ldots, G_{|C|}$, and view each node in each graph as having a single interface, and each network as having access to a single channel. Then each network graph $G_i$ can be viewed in isolation, and the throughput obtained in the original graph is the sum of the throughputs in each graph. From Theorem 11, in each graph we can show that the CGM scheduler is within $\frac{1}{\max\{1, K+\gamma\}} = \min\{1, \frac{1}{K+\gamma}\}$ of the optimal. Thus, even in the overall network, the CGM scheduler is within $\min\{1, \frac{1}{K+\gamma}\}$ of the optimal.


5.7  A  Rate-Proportional  Maximal  Multi-Channel 
(RPMMC)  Scheduler 

In  this  section,  we  describe  a  scheduler  where  a  link  does  not  require  any  information  about 
queue-lengths  at  interfering  links. 

The set of all links is denoted by $\mathcal{L}$. The arrival process for link $l$ is i.i.d. over all time-slots $t$, and is denoted by $\{\lambda_l(t)\}$, with $E[\lambda_l(t)] = \lambda_l$. We make no assumption about independence of arrival processes for two links $l, k$. However, we consider only the class of arrival processes for which $E[\lambda_l(t)\lambda_k(t)]$ is bounded, i.e., $E[\lambda_l(t)\lambda_k(t)] \le \eta$ for all $l \in \mathcal{L}$, $k \in \mathcal{L}$, where $\eta$ is a suitable constant.

Consider  the  following  scheduler; 


Rate- Proportional  Maximal  Multi-Channel  (RPMMC)  Scheduler 


Each link maintains a queue for each channel. The length of the queue for link $l$ and
channel $c$ at time $t$ is denoted by $q_l^c(t)$. In time-slot $t$, only those link-channel pairs with
$q_l^c(t) \ge r_l^c$ participate, and the scheduler computes a maximal schedule from amongst the
participating links. The new arrivals during this slot, i.e., $\lambda_l(t)$, are assigned to
channel-queues in proportion to the rates, i.e.,

$$\lambda_l^c(t) = \frac{r_l^c}{\sum_{b \in C} r_l^b}\, \lambda_l(t).$$
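
The scheduler description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy conflict rule (every pair of links interferes), the greedy way of building a maximal schedule, and the reading of the participation rule as "queue length at least one slot's worth of data" are ours and are not taken from the text.

```python
from fractions import Fraction


def split_arrivals(arrivals, rates):
    """Split one link's new arrivals across its channel-queues in proportion
    to the channel rates (integer inputs assumed, to keep the arithmetic exact)."""
    total = sum(rates.values())
    return {c: Fraction(arrivals * r, total) for c, r in rates.items()}


def participating_pairs(queues, rates):
    """Link-channel pairs whose queue holds at least one slot's worth of data."""
    return [(l, c) for (l, c), q in queues.items() if q >= rates[l][c]]


def maximal_schedule(candidates, conflicts):
    """Greedily keep every candidate that conflicts with nothing chosen so far.
    The result is maximal: no skipped candidate could still be added."""
    chosen = []
    for pair in candidates:
        if all(not conflicts(pair, other) for other in chosen):
            chosen.append(pair)
    return chosen


def conflicts(a, b):
    # Assumed toy model: all links interfere, so two pairs conflict when they use
    # the same channel; a single-interface link cannot use two channels at once.
    return a[0] == b[0] or a[1] == b[1]


# Hypothetical two-link, two-channel instance (rates and queue lengths are made up).
rates = {"l1": {1: 6, 2: 2}, "l2": {1: 6, 2: 2}}
queues = {("l1", 1): 6, ("l1", 2): 1, ("l2", 1): 7, ("l2", 2): 2}

new_arrivals = split_arrivals(4, rates["l1"])   # 3 units to channel 1, 1 unit to channel 2
active = participating_pairs(queues, rates)     # ("l1", 2) is below its rate and sits out
schedule = maximal_schedule(active, conflicts)
```

Note that nothing in `participating_pairs` or `split_arrivals` looks at queues of other links; only the (assumed) conflict relation couples the links.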

Theorem 13. The RPMMC scheduler stabilizes the queues in the network for any load-vector
within the $\delta$-reduced rate-region, where:

$$\delta = \frac{\sigma_s}{K|C| + \max\{1,\gamma\}\,|C|}$$

Proof.  The  proof  of  stability  is  based  on  a  Lyapunov  drift  argument.  We  present  a  proof- 
sketch  here.  The  full  proof  can  be  found  in  Appendix  C. 

We  adopt  the  following  convention:  at  the  beginning  of  each  time-slot,  the  scheduling 
decisions  are  taken,  and  transmissions  occur.  Then  new  arrivals  occur  at  the  end  of  the  slot 
(thus  new  arrivals  cannot  be  transmitted  in  the  same  slot). 

Let the queue-length of the queue for link $l$ and channel $c$ at the start of time-slot $t$ be
denoted by $q_l^c(t)$. Let the rate allocated to link $l$ in slot $t$ over channel $c$ be denoted by
$x_l^c(t)$. Since we are considering single-interface nodes, at most one of the $x_l^c(t)$'s is non-zero
for a link $l$. Furthermore, $x_l^c(t) = 0$ if link $l$ is not scheduled over channel $c$ in slot $t$, and
$x_l^c(t) = r_l^c$ otherwise.

Also note that only link-channel pairs with $q_l^c(t) \ge r_l^c$ participate in the scheduling
procedure during time-slot $t$.

Therefore, the queue dynamics are as follows:

$$q_l^c(t+1) = q_l^c(t) + \lambda_l^c(t) - x_l^c(t), \quad \text{where } \lambda_l^c(t) = \frac{r_l^c}{\sum_{b \in C} r_l^b}\, \lambda_l(t) \qquad (5.25)$$


We  define  the  following  Lyapunov  function: 


i&c  cec 


(5.26) 


*  \k&A{l)d&C  k  k&I'{l)  k 
This  Lyapunov  function  is  somewhat  similar  in  form  to  that  used  in  [120].  It  can  be 
shown  that  this  Lyapunov  function  satisfies  the  condition  stated  in  Lemma  42  (Lemma  2 
from  [85]).  This  proves  stability.  For  the  detailed  proof,  please  refer  to  Appendix  C.  □ 


Corollary 3. When $\beta_s = 1$, the RPMMC scheduler's efficiency ratio is at least:

$$\frac{|C|}{K|C| + \max\{1,\gamma\}\,|C|}$$
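
As a quick sanity check on the bound in Corollary 3, and assuming that both cardinalities in the denominator denote the number of channels $|C|$, the common factor cancels and the bound no longer depends on the number of channels:

```latex
\[
\frac{|C|}{K|C| + \max\{1,\gamma\}\,|C|}
  \;=\; \frac{|C|}{\bigl(K + \max\{1,\gamma\}\bigr)\,|C|}
  \;=\; \frac{1}{K + \max\{1,\gamma\}} .
\]
```

For instance, with the purely illustrative values $K = 2$ and $\gamma = 1$, the bound evaluates to $1/3$.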

Corollary 4. The efficiency-ratio of the RPMMC scheduler is always at least:

( _ ^ _ 'i 

\K  +  max{l,7}  J 

Proof.  The  proof  follows  from  Theorem  13  and  (5.1).  □ 

5.8  On  Scheduling  with  Heterogeneous  Interfaces 

The results presented in this chapter pertain to scenarios where the channels have heterogeneous
characteristics, but the interfaces are all identical. Thus, it is of interest to consider
how maximal scheduling algorithms may need to adapt in the face of heterogeneous interfaces,
each of which may have constrained switching ability, and may only be able to
operate on some subset of channels. The key distinction lies in the need to treat each
node-link (pair of nodes capable of direct communication) as a set of distinct radio-links
(corresponding to pairs of interfaces that could be used for communication). If a maximal
schedule is computed in a manner oblivious to the interface heterogeneity, this can lead to
performance degradation. We illustrate this via a very simple example:

Consider two mutually-interfering directed links A → B and C → D. There are two
channels 1 and 2 that both support the same data-rate $r$ over both links. Node A has two
radios, while all other nodes have one radio each. Nodes A and C both generate traffic
at a constant rate $r - \epsilon$ (where $\epsilon$ is a very small positive constant). It is easy to see that
7  =  0,ifr  =  l,(Ts  =  2  for  this  network.  Hence,  if  all  radios  were  identical  and  could  operate 
on  both  channels  1  and  2,  one  would  expect  any  maximal  scheduler  to  achieve  an  efficiency 
ratio  of  1  in  this  network. 

However, in the considered scenario, the radios are heterogeneous, and many of them
have constrained switching ability. The channel-sets on which these radios can operate are
depicted in Fig. 5.3. The optimal scheduling decision in this scenario is to operate link
A → B on channel 1 and link C → D on channel 2. A sub-optimal scheduler may schedule
A → B on channel 2, thereby making it impossible to schedule C → D simultaneously.

Note that this latter schedule is a valid maximal schedule. However, it is computed in a
manner oblivious to interface heterogeneity, and can consequently lead to a very substantial
performance degradation.

Figure 5.3: Example illustrating drawbacks of oblivious interface-selection

This motivates the importance of incorporating awareness of interface switching constraints
into the scheduling algorithm.
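
The two-link example can be checked by brute force. The sketch below assumes, since Fig. 5.3 is not reproduced here, that link A → B can be operated on channels 1 and 2 while link C → D can only be operated on channel 2; these channel sets are our reading of the text, and the helper names are hypothetical.

```python
from itertools import product

# Assumed usable channels per link (consistent with the text, not taken from Fig. 5.3).
USABLE = {"A->B": {1, 2}, "C->D": {2}}


def is_feasible(schedule):
    """A schedule maps each active link to one channel. The two links interfere,
    so they may not share a channel, and each link must use a usable channel."""
    channels = list(schedule.values())
    return (all(ch in USABLE[link] for link, ch in schedule.items())
            and len(channels) == len(set(channels)))


def all_feasible():
    """Enumerate every non-empty feasible schedule."""
    links = list(USABLE)
    options = [[None] + sorted(USABLE[link]) for link in links]
    found = []
    for combo in product(*options):
        schedule = {l: ch for l, ch in zip(links, combo) if ch is not None}
        if schedule and is_feasible(schedule):
            found.append(schedule)
    return found


def is_maximal(schedule):
    """Maximal: no idle link can be added on any of its usable channels."""
    for link in USABLE:
        if link in schedule:
            continue
        if any(is_feasible({**schedule, link: ch}) for ch in USABLE[link]):
            return False
    return True


maximal = [s for s in all_feasible() if is_maximal(s)]
```

Under these assumptions there are exactly two maximal schedules: one serving both links and one serving only A → B on channel 2. A scheduler that ignores the channel sets has no reason to prefer the first over the second.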

We  remark  that  it  is  indeed  possible  to  adapt  the  algorithm  of  Lin-Rasool  [74]  to  address 
heterogeneous  radios  (see  [8]).  We  expect  that  it  should  be  possible  to  similarly  adapt  the 
RPMMC  scheduler  to  heterogeneous  radios.  This  would  be  an  interesting  direction  for 
future  work. 

5.9  Discussion 

The  intuition  behind  the  RPMMC  scheduler  is  very  simple.  By  splitting  the  traffic  across 
channels  in  proportion  to  the  channel-rates,  each  link  basically  sees  the  average  of  all 
channel-rates  as  its  effective  rate.  This  helps  avoid  worst-case  scenarios  where  the  link 
may  end  up  being  repeatedly  scheduled  on  a  channel  that  yields  poor  rate  on  that  link. 
Though exceedingly simple, the algorithm is made attractive by the fact that no information
about queues at interfering links is required. Furthermore, we showed that the efficiency-ratio
of the RPMMC scheduler is always at least the bound stated in Corollary 4. Note
that $1 + \beta_s(|C| - 1) \le \sigma_s \le |C|$. Thus, the efficiency ratio of this algorithm does not degrade
indefinitely as $\beta_s$ becomes smaller.




5.10  Future  Directions 


The RPMMC scheduler provides motivation for further study of schedulers that work with
limited information. The scheduler of Lin-Rasool and the RPMMC scheduler represent two
extremes of a range of possibilities, since the former uses information from all interfering
links, while the latter uses no such information. Evidently, using more information can
potentially allow for a better provable efficiency-ratio. However, the nature of the trade-off
curve between these two extremes is not clear. For instance, an interesting question
to ponder is the following: if interference extends up to $M$ hops, but each link only has
information up to $x < M$ hops, what provable bounds can be obtained? This would help
quantify the extent of performance improvement achievable by increasing the information
exchange, and provide insights about suitable operating points for protocol design, since
control overhead can be a concern in real-world network scenarios.

Another  direction  for  future  work  consists  in  characterizing  network  topologies  in  which 
the  performance  of  greedy  maximal  scheduling  in  a  multi-channel  network  with  one  or  few 
interfaces  per  node  is  close-to-optimal. 

Chapter 6


Channel/Interface  Management  in 
a  Heterogeneous  Multi-Channel 
Multi-Radio  Network 


In this chapter, we describe a proof-of-concept protocol for channel and interface management
in a heterogeneous multi-channel wireless network. Our objective has been to
incorporate  awareness  of  radio  and  channel  heterogeneity  as  well  as  traffic-awareness  into 
the  channel  and  interface  management  procedure.  We  have  sought  to  leverage  the  insights 
from  our  theoretical  results  discussed  in  previous  chapters  of  this  dissertation,  as  well  as 
insights  from  prior  theoretical  work  in  the  literature.  While  we  have  designed  our  protocol 
in  the  context  of  802.11  networks,  with  certain  assumptions  on  node  configuration,  many 
aspects  of  the  design,  and  many  of  the  algorithms  used,  have  broader  relevance  for  a  wide 
range  of  networks  with  heterogeneous  radios  and/or  channels. 

We  begin  by  discussing  related  work  in  Section  6.1.  We  then  describe  the  general 
architectural  principles  of  our  approach  in  Section  6.2.  In  Section  6.3,  we  describe  the 
network  and  node  model.  In  Section  6.4,  we  provide  examples  of  various  kinds  of  network 
conflicts  that  may  need  to  be  addressed  by  a  channel  and  interface  management  protocol, 
and  then  discuss  the  protocol  design  in  detail  in  Section  6.5.  We  describe  simulation  results 
in  Section  6.6.  In  Section  6.7  we  discuss  some  observations  based  on  the  protocol  evaluation, 
and  conclude  in  Section  6.8  by  discussing  some  directions  for  future  work. 

6.1  Related  Work 

Protocols  and  architectures  for  multi-channel  networks  can  be  broadly  categorized  into  those 
intended  for  single-radio  devices,  and  those  intended  for  multi-radio  devices.  In  the  case  of 
single-radio  devices,  the  channel  coordination  problem  can  be  quite  complex  whereas,  with 
multi-radio  devices,  the  coordination  issues  are  made  somewhat  easier  to  address  by  the 
presence  of  many  radios. 

Many protocols have been proposed for channel-coordination amongst devices having a
single radio each. A useful taxonomy for these has been described in [84]. Some protocols
assume that all nodes are synchronized and follow a common hopping sequence when not
sending data. A pair of devices wishing to send data stop hopping after negotiating a data
transfer, and stay on a common channel till it is over. Then they again start hopping as
per the common hopping schedule. Instances include CHMA [111]. The class of split-phase
protocols comprises those that utilize a notion of a negotiation phase during which nodes
converge to a common channel and decide what channels to tune to for a window of time
in the future. Prominent amongst these is MMAC [106], which uses a notion of ATIM
window (similar to IEEE 802.11 PSM) to negotiate channels. Many proposals fall into the
category of multiple-rendezvous protocols, e.g., SSCH [4], McMAC [105], Dominion [88].
In these protocols, nodes follow channel-hopping schedules that allow them to converge
with each other sufficiently often. Of these, Dominion also includes a multi-channel routing
component.

An  approach  termed  component  based  channel  assignment  is  proposed  in  [114],  wherein 
all  interfaces  lying  on  the  routes  of  intersecting  flows  are  assigned  the  same  channel.  This 
keeps  channel  switching  to  a  minimum. 

Recently, there has been much interest in protocols/architectures for multi-channel
multi-radio networks. Examples of multi-channel multi-radio testbeds include the Net-X
project [67, 18, 12], a testbed at UCSB [96], and the Quail Ridge Reserve Mesh Network at
UC Davis. Of these, the Net-X testbed is relevant to our work, as we adopt the node
configuration used in Net-X.

Many protocols have been proposed to incorporate traffic awareness in various queueing
and scheduling decisions, both for single and multi-channel scenarios. Neighborhood RED
[122] proposes a variant of the RED algorithm, whereby queues at nodes within two hops
are also taken into account, and not just the local queue. Warrier et al. have proposed a
cross-layer architecture that is based on recent theoretical work on cross-layer optimization
[117]. Traffic-aware channel assignment in LANs has been considered in [97]. For LANs
with uncoordinated access points, it has been proposed in [82] that channel-hopping can
help prevent worst-case scenarios, and provide good average case performance. A traffic-oblivious
joint routing and scheduling scheme for mesh networks has been proposed in [116].
Route/schedule computation is centralized, and worst-case congestion is minimized.

The 802.11 standard provides multiple physical layer specifications, and NICs for these
are readily available off-the-shelf. There has been some work addressing the use of
radios of different types. Draves et al. [29] have considered the issue of routing in a
multi-channel multi-radio mesh network where nodes are equipped with one radio each of type
802.11a and 802.11g. However, they do not consider the problem of channel selection.

The  use  of  heterogeneous  interfaces  to  handle  route  breakages  has  been  proposed  in 
[127].  In  this  work,  nodes  are  equipped  with  primary  802.11a  interfaces  and  secondary 
802.11b  interfaces.  TCP  flows  use  a  primary  path  comprising  the  802.11a  interfaces,  which 
is  discovered  via  a  reactive  routing  protocol.  A  proactive  routing  protocol  is  run  over  the 
secondary  interfaces.  When  a  link-breakage  is  detected,  the  TCP  traffic  can  be  immediately 
re-routed  over  the  secondary  path  while  a  new  primary  path  is  being  discovered. 

Joint channel assignment and routing in a heterogeneous multi-channel multi-radio wireless
network has been considered in [118]. This work targets a situation very similar to what
we have considered in this chapter, and is closest in scope to our work. It allows for
heterogeneity both in the operational abilities of interfaces and in supported channel data-rates.
It handles both single-radio and multi-radio devices. A joint channel-assignment
and routing scheme (JCAR) is proposed. However, this work treats the route for each
flow as a sequence of interfaces, and therefore does not consider the possibility of link-layer
data-striping. Moreover, it seeks a solution where interfaces switch channels only over
substantially long periods of time.

The channel diversity in a multi-channel network provides an opportunity not merely for
load-balancing but also for opportunistic selection of the channel with better channel quality.
Opportunistic channel selection has been considered in MAC protocols such as MOAR [52],
DB-MCMAC [14] and OMC-MAC [130]. However, the global routing implications of local
opportunism in a multi-hop wireless network have not been studied. Optimal channel
probing strategies for a single-user multi-channel system have been studied in [40, 17]. The
considered systems typically comprise one transmitter, capable of operating on $N$ channels,
which must select one channel for transmission. Self-organization based on measurements
is considered in [53], where the approach consists of using a Gibbs sampler. Channel quality
and rate-aware routing was addressed in [23].

6.2  General Design/Architectural Principles

We  begin  by  briefly  describing  the  general  design  and  architectural  principles  on  which  we 
have  based  our  protocol  for  multi-channel  multi-radio  wireless  networks. 

A Route as a Sequence of Nodes. A node-link is a pair of neighboring nodes. A radio-link
is a pair of radios on neighboring nodes. Thus, a node-link comprises a set of radio-links,
and with suitable link-layer strategies, one can exploit this diversity/multiplicity. We adopt
an approach of single-path routing with link-layer data-striping. Thus, a path from source
to destination is a single sequence of nodes (and hence also a series of node-links). When
packets need to be transmitted over a node-link, the link layer determines which radio(s)
and channel(s) to use. Thus, the link layer can perform link-level data-striping if many
radios are available at both transmitter and receiver. Moreover, when there are multiple
flows that pose interference and/or interface conflicts for each other, this approach allows
flexibility in adapting on the fly, as the link layer can make packet scheduling decisions at
fine granularity.

Channel Restriction. While one would like to exploit the available channel diversity to
improve  throughput,  doing  so  effectively  would  require  some  mechanism  to  sample/probe 
channels,  as  well  as  exchange  of  information  about  channel  state/quality.  This  cost  can  be 
significant,  especially  if  the  number  of  available  channels  is  large.  Moreover,  in  a  distributed 
setting,  when  multiple  entities  act  independently,  opportunism  can  have  an  adverse  effect 
on  load-balance,  e.g.,  consider  a  worst-case  scenario  where  all  nodes  in  a  vicinity  decide 
that  channel  x  has  best  quality  and  start  using  that  channel  simultaneously. 

One would typically expect that much of the benefit of opportunistic exploitation of
channel diversity can be obtained by having the choice of a few channels, and thus a
reasonable solution lies in restricting the operation of a link to a subset of all possible
channels available to it (a channel pool). One can then attempt to opportunistically exploit
diversity amongst channels in this channel pool. We note that some prior work, e.g., [115],
has studied this issue in a single-hop setting and concluded that a few channels indeed
provide a good trade-off between diversity-gain and probing cost. The same conclusion is
likely to hold even in multi-hop settings.

Moreover, channel-restriction has the potential to provide a degree of a priori load-balance
(since different links will have different channel pools). This can help reduce the
possibility of worst-case channel-selection scenarios like the one mentioned above, while
still providing enough choices to each link for good load-balance. Some intuition for this
can be derived from our result for random $(c, f)$ assignment described in Chapter 4, as well
as past work on balls and bins with choices [3, 83].

We propose the following simple channel restriction policy: each interface is assigned a
small pool of $f$ channels for substantial periods of time. The channel pools are chosen and
adjusted so that, within the two-hop neighborhood of any interface, each channel occurs in
the pool of approximately the same number of interfaces.

The current channel of each interface is selected from its pool more frequently than the
pool itself is adjusted.

It is to be noted that the pool size $f$ provides a control knob to tune the degree of
dynamism of the protocol. Setting $f = 1$ corresponds to a largely static channel assignment
(where interfaces switch channels very infrequently), while setting $f = c$ corresponds to a
fully dynamic assignment, in which the current channel may be chosen from the entire set
of possible channels.
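
One simple way to approximate the balancing goal of this policy is a greedy rule: give an interface the channels that currently appear in the fewest pools within its two-hop neighborhood. The sketch below illustrates that rule only; the function name, its inputs, and the tie-breaking by channel number are assumptions of ours and should not be read as the protocol's actual pool-selection algorithm.

```python
from collections import Counter


def assign_pool(usable_channels, neighbor_pools, f):
    """Pick a pool of f channels for one interface.

    usable_channels: channels this interface is able to operate on.
    neighbor_pools:  pools of the interfaces in its two-hop neighborhood.
    f:               pool size (f = 1 is static, f = all channels is fully dynamic).
    """
    # Count how many neighboring pools already contain each channel.
    usage = Counter(ch for pool in neighbor_pools for ch in pool)
    # Prefer the least-used channels; break ties by channel number for determinism.
    ranked = sorted(usable_channels, key=lambda ch: (usage[ch], ch))
    return ranked[:f]


# Hypothetical neighborhood: channel 1 is heavily used, channel 4 is unused.
pool = assign_pool([1, 2, 3, 4], [[1, 2], [1, 3], [2, 1]], f=2)
```

With `f` equal to the number of usable channels, the function returns every channel, which corresponds to the fully dynamic end of the control knob described above.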

Late Binding of Packets to Channel/Interface. Since we intend to perform dynamic
channel  selection  over  intermediate  time-scales,  it  is  beneficial  to  defer  the  binding  of  an 
outgoing  packet  to  a  channel  and  interface  to  as  late  a  stage  as  possible  without  significantly 
affecting  efficiency.  This  allows  for  greater  flexibility  and  adaptivity. 

Channel Cost Formulation Incorporating Awareness of Traffic Levels and Conflicts.
Two kinds of conflicts can limit performance in a multi-channel network:

1. Interference Conflicts: A channel becomes the bottleneck due to traffic overload.

2. Interface Conflicts: A radio-interface becomes the bottleneck due to an overload of
traffic it is expected to relay.

Thus, a link cost metric for scheduling should try to capture these two conflicts, so that
channel/interface selection decisions are able to address them effectively.

Use of limited information from vicinity. A wireless transmission can create interference
for other transmissions over a distance corresponding to many hops, depending on the
transmission powers, rates, and corresponding SINR requirements. Moreover, the choice
of carrier-sense threshold also affects the degree of spatial reuse achievable. If the carrier-sense
threshold is conservatively set to a large value, a single ongoing transmission can block
other transmissions over a large area extending well beyond its two-hop neighborhood. If
the region over which a link can potentially create conflict extends over $K$ hops, where $K$ is
large, then it may not be feasible to provide a node information about this whole region due
to concerns about high overhead, as well as large delays, because of which the information
may become stale by the time it is received. Thus, it is desirable to operate using limited
exchange of explicit information, and use implicit feedback mechanisms to infer network
and channel conditions. Therefore, in the proposed design approach, nodes only have explicit
information up to two hops, but use contention on a channel as an implicit indicator
of traffic levels.

A high-level schematic of the envisioned framework incorporating the elements described
above is depicted in Fig. 6.1.

Figure 6.1: General Architectural Template

6.3  The  Model 

We assume a node configuration similar to the Net-X Project [64], where interfaces are
classified as belonging to one of the following two categories:

1. R-interface: An R-interface is used for receiving packets, and whenever its channel
is changed, the change is advertised to neighbors. An R-interface is also used for
transmitting packets that are to be sent on its current channel.

2. T-interface: A T-interface is used for transmitting packets. When a packet is to be
transmitted to a next-hop node, a T-interface is switched to one of the R-channels of
the next-hop node, and used to transmit the packet.

The interfaces can be of type: single-mode 802.11a, single-mode 802.11g and multi-mode
802.11ag.^

Each node is assumed to either have at least one R-interface and one T-interface of type
$x$ or no interface of type $x$, where $x$ can be 802.11a or 802.11g. A multi-mode 802.11ag
radio can be present as a T-interface, and can be counted towards each type, e.g., if a node
has one R-interface each of type 802.11a and 802.11g, and a T-interface of type 802.11ag,
then this is a valid configuration. Currently, we do not allow multi-mode R-interfaces.

Note  that  the  above  classification  into  R-interfaces  and  T-interfaces  is  purely  a  link-layer 
characteristic,  based  on  how  the  link  layer  intends  to  utilize  each  interface;  each  interface  of  a 
particular  type  is  otherwise  identical,  and  has  the  same  physical  and  MAC  layer  properties. 

Adopting this dual-radio framework helps avoid connectivity issues, and channel coordination
problems such as multi-channel deafness [79], and enables us to focus on the
scheduling aspects of the problem.

At  each  node,  we  have  a  single  link-layer  entity  that  manages  all  interfaces  (which 
perform  independent  MAC  procedures).  Since  we  wish  to  perform  single-path  routing 
while  allowing  for  the  possibility  of  transparent  link-layer  striping,  we  require  all  interfaces 
of  a  node  to  have  the  same  IP  address.  To  avoid  changing  ARP,  all  interfaces  of  a  node  are 
also  assigned  the  same  MAC  address. 

Interfaces are assumed to be capable of fairly fast switching. More specifically, we
consider that switching between channels in the same mode takes 250 μs (this is consistent
with channel switching times reported in recent work, e.g., [41]). If a mode-switch is also
required while doing the channel switch, then we assume that the time taken is 500 μs, since
a mode-switch might typically take more time than a simple channel-switch.
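
The switching-delay assumption can be written down as a small lookup. This is a minimal sketch: the two delay values come from the text, while the channel-to-mode table and the function name are hypothetical and only serve the illustration.

```python
SAME_MODE_SWITCH_US = 250   # channel switch within the same mode (from the text)
MODE_SWITCH_US = 500        # channel switch that also changes mode (from the text)

# Illustrative channel-to-mode map: 36 and 40 as 802.11a channels, 1 and 6 as 802.11g.
MODE_OF_CHANNEL = {36: "a", 40: "a", 1: "g", 6: "g"}


def switch_delay_us(current, target):
    """Assumed delay, in microseconds, for retuning an interface."""
    if current == target:
        return 0  # no retuning needed
    if MODE_OF_CHANNEL[current] == MODE_OF_CHANNEL[target]:
        return SAME_MODE_SWITCH_US
    return MODE_SWITCH_US
```

Only a multi-mode (802.11ag) interface would ever incur the 500 μs case, since single-mode interfaces never change mode.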

We  have  designed  a  channel  and  interface  management  protocol  for  this  described  model. 
For  evaluation  with  multi-hop  flows,  we  use  manually  specified  routes,  wherever  needed. 

^As we mention later, 802.11b is not considered separately, as we currently fix the 802.11b/g rate at 2
Mbps, and thus the two are effectively the same, if 802.11g is operated in backward compatibility mode
(which is what we assume).

Currently, we do not consider dynamic rate adaptation. The data-rate for all 802.11a
communication is 6 Mbps, while that for all 802.11g communication is 2 Mbps. We do
remark that the link layer algorithms can operate in the presence of a rate-adaptation
algorithm, with suitable link-rate feedback from the MAC. However, our current goal is
to study the channel and interface management aspects without regard to interaction with
rate-adaptation. Incorporating a suitable auto-rate fallback algorithm at the MAC, and
providing appropriate rate feedback to the link layer, would be an interesting direction for
future work.

The RTS/CTS mechanism is effectively disabled in the 802.11 MAC protocol by choosing
a very high value for RTS_Threshold. Physical carrier sense is used. 802.11g uses 2
Mbps as the data rate for all packets (including broadcast and ACK packets). The PLCP
data rate is 1 Mbps for 802.11g, while it is 6 Mbps for 802.11a. 802.11g operates in backward
compatibility or mixed-mode and uses the same MAC parameters as 802.11b.

Since  the  link  layer  may  perform  data-striping  over  a  link,  there  is  a  possibility  of  out- 
of-order  packet  delivery,  and  thus  reordering  of  packets  may  be  required.  Currently,  we  do 
not  address  this  issue,  as  reordering  can  also  be  done  at  the  receiving  transport  endpoint. 
However,  we  discuss  the  issue  of  implementing  a  reordering  buffer  at  the  link-layer  in  Section 
6.8. 

6.4  Interference  and  Interface  Conflicts 

As  was  discussed  in  Section  6.2,  the  channel  cost  metric  should  be  able  to  capture  both 
interference  and  interface  conflicts.  Before  we  move  on  to  describe  our  protocol,  and  how  it 
addresses  this  issue,  let  us  consider  a  few  illustrative  examples  in  the  context  of  the  specific 
network  and  node  model  we  are  considering.  In  these  examples,  each  node  has  one  802.11a 
R-interface  and  one  802.11a  T-interface,  and  for  the  purpose  of  simplicity,  we  assume  that 
ideal  TDMA  scheduling  is  possible.  The  transmission  rate  in  use  is  6  Mbps. 

Example 1. Consider the situation in Fig. 6.2. There are only two 802.11a channels
available for use (let us denote them by 1 and 2). All links interfere with each other.

Consider two different traffic patterns:

1. Link $l_1$ has traffic demand 6 Mbps, while links $l_2$ and $l_3$ have traffic demand 3 Mbps
each. An ideal scheduler can meet these demands by having $l_1$ operate on channel
1 and $l_2$ and $l_3$ operate on channel 2. A traffic-unaware static distributed channel
assignment strategy can at best place two of these links on one channel, in
a manner oblivious to actual load. Thus, it could potentially operate $l_1$ and $l_2$ on
channel 1 and $l_3$ on channel 2, resulting in throughput degradation.

2. Each link $l_1, l_2, l_3$ has a single flow with traffic demand 4 Mbps. An ideal scheduler
can have links $l_1$ and $l_2$ operate over channels 1 and 2 respectively, and have $l_3$ time-share
between channels 1 and 2, as follows. In a unit interval $[0, 1]$, the following
schedule is followed: during $[0, 1/3]$, $l_1$ transmits over channel 1 and $l_3$ transmits over
channel 2; during $[1/3, 2/3]$, $l_1$ transmits over channel 1 and $l_2$ transmits over channel 2;
during $[2/3, 1]$, $l_3$ transmits over channel 1 and $l_2$ transmits over channel 2. Each link is
thus active for two thirds of the time at 6 Mbps, which allows all traffic demands to be met.
A static and traffic-unaware channel-assignment strategy would not be able to achieve
this.

Figure 6.2: Example 1: Interference Conflicts
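
The time-shared schedule of the second traffic pattern can be verified with exact arithmetic. The sketch below assumes that the unit interval is split into three sub-intervals of length 1/3 each, which is the only split that gives every link two thirds of the time; the link labels and function name are ours.

```python
from fractions import Fraction

RATE_MBPS = 6  # 802.11a rate used in the example

# (start, end, {link: channel}) over the unit interval [0, 1]
SCHEDULE = [
    (Fraction(0), Fraction(1, 3), {"l1": 1, "l3": 2}),
    (Fraction(1, 3), Fraction(2, 3), {"l1": 1, "l2": 2}),
    (Fraction(2, 3), Fraction(1), {"l3": 1, "l2": 2}),
]


def throughputs(schedule, rate=RATE_MBPS):
    """Average throughput per link (Mbps) under ideal TDMA."""
    totals = {}
    for start, end, active in schedule:
        # All links interfere, so a channel may carry at most one link at a time.
        if len(set(active.values())) != len(active):
            raise ValueError("two interfering links share a channel")
        for link in active:
            totals[link] = totals.get(link, 0) + rate * (end - start)
    return totals


result = throughputs(SCHEDULE)  # every link ends up with 4 Mbps
```

A static assignment, by contrast, must put two of the three links on one channel, where their combined demand of 8 Mbps exceeds the 6 Mbps channel rate.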

Now  consider  an  example  illustrating  a  potential  interface  conflict  and  how  it  can  be 
resolved: 

Example 2. Consider the situation in Fig. 6.3. There are three 802.11a channels available
for use. There are two flows, X → Y and X → Z, with traffic demand 6 Mbps each. If the
R-interfaces of all 3 nodes are on different channels, the maximum aggregate throughput
possible is 6 Mbps. However, if the R-interface of either Y or Z is on the same channel as
the R-interface of X, while the R-interface of the remaining node is on another channel,
then both flows can get 6 Mbps, since X can use its R-interface to transmit packets to one,
and its T-interface to transmit packets to the other. A traffic-unaware strategy that only
considers interference conflicts in a combinatorial sense (number of interfering interfaces
on a channel) would not be adequate for this; in fact, such a strategy would typically try to
assign different channels to all 3 R-interfaces.

Figure 6.3: Example 2: Interface Conflicts
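
The interface conflict in Example 2 can be reproduced with a small brute-force model. The sketch makes simplifying assumptions of our own: ideal TDMA at 6 Mbps, a packet to a next hop is sent on that node's R-channel, the sender may use its R-interface only when the channels match and its T-interface otherwise, and any group of flows linked by a shared interface or channel is charged a single 6 Mbps. That last rule is exact for the two-flow case shown here but conservative for longer chains of conflicts.

```python
from itertools import product

RATE_MBPS = 6


def merge_overlapping(resource_sets):
    """Group flows whose resource sets overlap, directly or transitively."""
    groups = []
    for current in resource_sets:
        current = set(current)
        untouched = []
        for group in groups:
            if group & current:
                current |= group       # absorb any group sharing a resource
            else:
                untouched.append(group)
        groups = untouched + [current]
    return groups


def aggregate_throughput(r_channel, sender, receivers):
    """Best aggregate throughput (Mbps) over all ways of picking the sender's
    interface ("R" or "T") for each flow, under the simplified model above."""
    best = 0
    for ifaces in product("RT", repeat=len(receivers)):
        # The R-interface stays on the sender's own R-channel.
        if any(i == "R" and r_channel[d] != r_channel[sender]
               for i, d in zip(ifaces, receivers)):
            continue
        resources = [{("iface", i), ("chan", r_channel[d])}
                     for i, d in zip(ifaces, receivers)]
        best = max(best, RATE_MBPS * len(merge_overlapping(resources)))
    return best


all_different = aggregate_throughput({"X": 1, "Y": 2, "Z": 3}, "X", ["Y", "Z"])
y_shares_with_x = aggregate_throughput({"X": 1, "Y": 1, "Z": 2}, "X", ["Y", "Z"])
```

With all three R-interfaces on different channels both flows must share X's T-interface, whereas letting Y share X's R-channel frees the T-interface for the flow to Z.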


6.5  The  Heterogeneous  Multi-Channel  Link  Layer 
(HMCLL)  Protocol 

The  proposed  link  layer  protocol,  which  we  term  the  Heterogeneous  Multi-Channel  Link 
Layer  (HMCLL)  Protocol,  can  be  said  to  lie  in  Layer  2.5,  i.e.,  between  layers  2  and  3  in  the 
protocol  stack.  The  HMCLL  is  IP-aware.  This  IP-awareness  has  two  benefits: 

•  HMCLL  control  packets  have  IP  headers,  and  the  HMCLL  can  cache  IP-to-MAC 
mappings  in  the  ARP  table.  This  provides  resilience  to  issues  caused  by  ARP  losses 
(see  [15]  for  an  exposition  on  ARP-loss  related  problems  in  wireless  networks). 

•  The  HMCLL  can  provide  the  network  layer  with  a  cost  associated  with  a  link  to  a 
next-hop  node,  identified  by  the  network  layer  via  its  IP  address.  While  the  focus  of 
the  current  work  is  on  designing  an  intelligent  link  layer  protocol,  it  is  of  great  interest 
to  consider  future  work  where  the  link-layer  provides  an  abstracted  cost  metric  to  a 
routing  protocol.  We  discuss  future  directions  in  Section  6.8. 

The HMCLL protocol aims to handle scenarios with different numbers and types of interfaces,
and channels with different rates. Many of the HMCLL algorithms are conceptually formulated
in fairly general terms where each channel is characterized by the rates achievable on
different links using that channel, each interface is characterized by the set of channels on
which it can operate, and the relationship between channels is characterized by the extent
to which they compete for interface-time at a node.^5 Thus, they could be applied to a
wider range of radio types, provided an appropriate characterization of the above elements
is made available to them.

Note  that  different  channel  rates  may  arise  due  to  various  reasons,  e.g.,  (a)  as  a  result  of 
different modulations providing different transmission rates (e.g., we use 6 Mbps for 802.11a
and 2 Mbps for 802.11g), or (b) as a result of variable channel quality leading to different
packet loss rates (and hence different net rates). While most of the protocol algorithms are
oblivious of the reason for the different rates (and just use information about achievable
rates for making decisions), we do remark that there is an important practical distinction
one must be aware of: rate differences due to different modulations are known accurately
a  priori,  whereas  rate  differences  due  to  variable  channel  quality  require  good  channel 
estimation  techniques  to  determine  with  fair  accuracy.  While  most  algorithms  used  by 
the  protocol  are  applicable  in  either  scenario,  achieving  good  performance  in  environments 
with  highly  dynamic  channel  conditions  will  require  that  good  estimates  of  achievable  rates 
be  available,  which  in  turn  would  require  improved  channel-estimation  techniques  beyond 
the  rudimentary  estimation  mechanisms  used  by  the  current  design.  Similarly,  the  current 
simplistic  neighborhood  management  would  need  much  improvement.  We  discuss  this  issue 
further  in  Section  6.8. 

6.5.1  Neighborhood and Channel/Traffic Statistics Maintenance

We begin by introducing some terminology. The one-hop neighborhood of a node $u$ is
denoted by $nbd(u)$, and its two-hop neighborhood is denoted by $nbd_2(u)$. In this chapter,
$u$ is not considered to be included in $nbd(u)$ or $nbd_2(u)$.

Each node $u$ has a set of active interfaces $M(u) = M_R(u) \cup M_T(u)$, where $M_R(u)$ and
$M_T(u)$ are the R-interfaces and T-interfaces respectively of node $u$. Let $C(x)$ denote the set
of channels on which interface $x$ is capable of operating. Each interface has a type denoted
by $type(x)$ which uniquely determines the set of channels $C(x)$ on which $x$ can operate.^6
Each R-interface $x$ has an associated subset of channels called the channel-pool
$\mathcal{P}(x) \subseteq C(x)$ such that $|\mathcal{P}(x)| = f$. The current channel of interface $x$ is denoted by $c(x)$. We use the
notation $c(S)$, where $S$ is a set, to denote $\bigcup_{x \in S} \{c(x)\}$. An interface is said to be active if it is
in use (i.e., has not been deactivated by the LL).^7

^5 There exist other aspects to the relationship between channels, e.g., adjacent channel interference, and
one could potentially try to extend the characterization to include these. However, that is beyond the scope
of the current work, which assumes orthogonal channels.

^6 For instance, we currently consider three types: 802.11a, 802.11g, and 802.11ag. Of these, only 802.11a
and 802.11g are valid types for R-interfaces.

^7 Some interfaces may be deactivated if the LL is unable to assign all local interfaces distinct channels,
e.g., when the number of channels available for use is smaller than the number of interfaces at the node.

The link layer maintains the following information:

•  A List of One Hop Neighbors: This contains an entry for each node in $nbd(u)$ known to
$u$. Each neighbor entry has a LifeTime field, as well as aLifeTime and bLifeTime fields.
It is also marked as symmetric or asymmetric. If the LL receives a new packet from the
higher layers with a next hop node that is currently marked asymmetric, it drops the
packet. Each entry also has a reachability flag for each of 802.11a and 802.11g based
on the respective lifetime value; these a/g-specific attributes are maintained primarily
to provide a basic binary measure of achievable rate (0 or the raw data-rate) in the
absence of any accumulated rate history.

•  A List of all 2-hop Neighbors: This contains an entry for each node in $nbd_2(u)$ known
to $u$.

•  Statistics about each local interface: An estimate of interface TX-utilization for
interface $x$, denoted by $\rho(x)$, i.e., the fraction of time the interface was busy doing work
related to transmitting (contending, transmitting, switching) is computed. Utilization
is computed over intervals of duration $T_{rassign}$, and an average utilization estimate
$\bar{\rho}(x)$ is maintained as an EWMA updated as $\bar{\rho}(x) = 0.25 \cdot \bar{\rho}(x) + 0.75 \cdot \rho(x)$.

•  The following statistics are maintained about each channel on which some local
interface can operate:

–  Effective Transmission Rate for a link, denoted by $r(u,v,c)$: For each packet sent
by $u$ to $v$ over channel $c$, the MAC provides the LL feedback on the number of
transmission attempts needed ($x(u,v,c)$), as well as the raw datarate used ($R$).
The success rate $\psi(u,v,c)$ is maintained as an EWMA, and updated as follows:

\[ \psi(u,v,c) = 0.25 \cdot \psi(u,v,c) + 0.75 \cdot \frac{1}{x(u,v,c)} \tag{6.1} \]
The instant effective rate is $r_{new}(u,v,c) = R \cdot \psi(u,v,c)$. The LL maintains $r(u,v,c)$
as an EWMA, which is updated as follows:

\[ r(u,v,c) = 0.25 \cdot r(u,v,c) + 0.75 \cdot r_{new}(u,v,c) \tag{6.2} \]

If the last update of $r(u,v,c)$ occurred more than $2 \cdot T_{QINFO}$ time ago, $\psi(u,v,c)$
is reset to 1 and $r(u,v,c)$ is reset to NO_RATE_HISTORY.

–  Net Data Rate for a link, denoted by $\mu(u,v,c)$: this is the net time taken to
transmit a packet, when taking into account the time spent in contention, i.e.,
backoff, etc., as well as any retransmissions. This is maintained as an EWMA.
Whenever the LL gets feedback from the MAC that the total time taken in
transmitting a packet was $\mu_{new}$, it updates the estimate as
$\mu(u,v,c) = 0.9 \cdot \mu(u,v,c)_{old} + 0.1 \cdot \mu_{new}$. If the last update of $\mu(u,v,c)$ occurred more than
$2 \cdot T_{QINFO}$ time ago, $\mu(u,v,c)$ is reset to NO_RATE_HISTORY.

–  Average Contention Time experienced by $u$ when transmitting a packet on
channel $c$, denoted by $\kappa(u,c)$. This is also maintained as an EWMA. Whenever the
LL gets feedback from the MAC that a packet required contention time $\kappa$ on
channel $c$, we use the following update equation: $\kappa(u,c) = 0.9 \cdot \kappa(u,c) + 0.1 \cdot \kappa$.

Note that all rate estimates above are in units of bits per second.
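
The EWMA bookkeeping described above can be sketched as follows. This is a minimal illustration rather than the simulator code: the class name `RateEstimate`, the use of `None` as a stand-in for NO_RATE_HISTORY, the numeric staleness window, and the choice to seed the average with the first sample are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: None stands in for the protocol's NO_RATE_HISTORY marker.
NO_RATE_HISTORY = None


@dataclass
class RateEstimate:
    """EWMA of the effective rate r(u, v, c) for one link and channel."""

    new_weight: float = 0.75      # weight on the newest sample, as in Eq. (6.2)
    stale_after: float = 2.0      # assumed value of 2 * T_QINFO, in seconds
    value: Optional[float] = NO_RATE_HISTORY
    last_update: Optional[float] = None

    def expire(self, now: float) -> None:
        """Forget the history if the last update is older than stale_after."""
        if self.last_update is not None and now - self.last_update > self.stale_after:
            self.value = NO_RATE_HISTORY
            self.last_update = None

    def update(self, sample: float, now: float) -> float:
        """Fold one instantaneous rate sample (bits/s) into the average."""
        self.expire(now)
        if self.value is NO_RATE_HISTORY:
            # Assumption: with no history, the first sample seeds the average.
            self.value = sample
        else:
            self.value = (1.0 - self.new_weight) * self.value + self.new_weight * sample
        self.last_update = now
        return self.value
```

With these weights a single new sample dominates the estimate, so the average tracks recent conditions closely; the contention-time and net-rate estimates use the opposite weighting (0.9 on the old value) and therefore change more slowly.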

Neighborhood  management,  as  well  as  channel  and  traffic  statistics  maintenance  are 
facilitated  by  exchange  of  link  layer  control  packets. 

For each $v \in nbd(u)$, $u$ maintains a set $\mathcal{T}(u,v) \subseteq M_R(v)$, which is the set of R-interfaces
of $v$ that $u$ would be willing to send packets to. The choice of $\mathcal{T}(u,v)$ can be used to
allow/disallow link-layer data-striping (e.g., if $|M_R(v)| > 1$ but $|\mathcal{T}(u,v)| = 1$, then this
corresponds to no data striping). Currently, we use $\mathcal{T}(u,v) = M_R(v)$. However, in the
rest of the description, we will continue to use the term $\mathcal{T}(u,v)$ to highlight that the link
layer algorithms can work for other choices of $\mathcal{T}(u,v)$ (of course, in that case, an additional
algorithm will be needed to select $\mathcal{T}(u,v)$).

The link layer also maintains a system of queues (described later in this chapter). These
include a queue of outgoing packets to each next-hop neighbor. The length of the queue
(in bits) for neighbor $v$ at node $u$ is denoted by $q_{nbr}(u,v)$. There is also a queue for each
channel. The length of the queue (in packets) for channel $c$ is denoted by $q_{ch}(u,c)$.

We also use the following definitions and notation:

The minimum-rate constant $\theta$ is a small constant chosen such that $\theta$ is much smaller
than the typical values of achievable rates. The primary purpose of $\theta$ is to avoid
division-by-zero anomalies when computing various quantities of interest. In the current design, we
use $\theta = 1$ (as the typical rate values are of the order of $10^6$ in bits/sec).

The ratesum for a link $(u,v)$ is denoted by $\sigma(u,v)$ and defined as:

\[ \sigma(u,v) = \sum_{y \in \mathcal{T}(u,v)} r(u,v,c(y)) \]

Intuitively, the significance of the ratesum is that the LL needs to estimate the load on
each channel in the near future. To do so, it pretends that each neighbor $v$ splits traffic
it sends to $u$ across channels in $\mathcal{T}(u,v)$ in proportion to the channel-rates, and therefore,
the ratesum plays a role in computing various estimates, as will be evident ($v$ may not
necessarily split traffic in this manner, but it serves as a reasonable hint for LL decisions).
Note that this is reminiscent of the RPMMC scheduler described in Chapter 5, from which
we drew intuition for this approach.
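
To make the role of the ratesum concrete, the sketch below computes $\sigma$ and the rate-proportional split that the LL assumes. The function names and the dictionary layout (current channel of each R-interface mapped to the estimated rate on that channel) are illustrative assumptions, not part of the protocol description.

```python
from typing import Dict

THETA = 1.0  # minimum-rate constant in bits/s, as used in the text


def ratesum(rates: Dict[int, float]) -> float:
    """sigma: sum of the per-channel rates r(., ., c(y)) over the usable R-interfaces.

    `rates` maps the current channel of each such interface to its rate in bits/s.
    """
    return sum(rates.values())


def proportional_split(queue_bits: float, rates: Dict[int, float]) -> Dict[int, float]:
    """Bits the LL pretends will be sent on each channel, in proportion to rate."""
    sigma = max(ratesum(rates), THETA)  # theta guards against division by zero
    return {channel: queue_bits * rate / sigma for channel, rate in rates.items()}
```

For example, 8000 queued bits split over a 6 Mbps channel and a 2 Mbps channel yield 6000 and 2000 bits respectively. Each share then takes the same time to drain (1 ms here, i.e., queue length divided by ratesum), which is why queue lengths are normalized by the ratesum in the quantities that follow.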

The link-layer at $u$ tracks the number of bits sent to $v$ over intervals of duration $T_{rassign}$,
denoted by $s(u,v)$. Average sent bits for link $(u,v)$ are denoted by $\bar{s}(u,v)$, and maintained
as an EWMA. At the end of every period, $\bar{s}(u,v)$ is updated as:

\[ \bar{s}(u,v) = 0.25 \cdot \bar{s}(u,v) + 0.75 \cdot s(u,v) \]

Interface-conflict cost for channel $c$ over link $(u,v)$ is denoted by $\chi(u,v,c)$ and defined as
follows (in the following text $K$ is a suitably chosen threshold constant):

1.  If $q_{nbr}(u,v) < K$, then $\chi(u,v,c) = 0$.

2.  If $q_{nbr}(u,v) \geq K$:

(a)  If $c$ is an R-channel of $u$, i.e., there is $x \in M_R(u)$ such that $c(x) = c$, then it is
defined as:

\[ \chi(u,v,c) = \sum_{\substack{w \in nbd(u) \\ \exists\, y \in \mathcal{T}(u,w):\ c(y) = c}} \frac{q_{nbr}(u,w)}{\max\{\sigma(u,w), \theta\}} + T_{rassign}\,(\rho(x) - 0.8)_+ \]

(b)  If $c$ is not an R-channel, let $S(b) \subseteq M_T(u)$ be the set of T-interfaces of $u$ that
can operate on a channel $b$. Then:

\[ \chi(u,v,c) = h(u,v,c) + U(u,c)\, I_{h(u,v,c) > H} \]

where

\[ h(u,v,c) = \frac{1}{|S(c)|} \sum_{x \in S(c)} \left( \sum_{w \in nbd(u)} \sum_{\substack{y \in \mathcal{T}(u,w) \\ c(y) \notin c(M_R(u)) \\ \wedge\, c(y) \in C(x)}} \frac{q_{nbr}(u,w)}{\max\{\sigma(u,w), \theta\}\,|S(c(y))|} \right), \]

$U(u,c) = \ldots - 0.8)_+$ and $H$ is a suitably chosen threshold.


To  provide  some  intuition  for  the  relevance  of  this  quantity,  it  provides  an  estimated  measure 
of  the  amount  of  traffic  (normalized  by  rate)  that  contends  for  interface  time  at  sending 
neighbor  v  on  the  interface(s)  that  are  used  to  send  packets  on  channel  c.  The  utilization- 
based  component  is  included  primarily  because  when  we  have  TCP  traffic,  the  queues  may 
never  become  large  enough  to  trigger  a  change  in  channel  assignment;  in  those  scenarios 
tracking  interface  utilization  becomes  important,  as  a  heavily  utilized  interface  implies  a 
large  conflict  cost. 

The local interface conflict seen by channel $c$ at node $u$ is denoted by $\chi_{local}(u,c)$ and
defined as:

1.  If $c$ is the current channel of a local R-interface, $\chi_{local}(u,c) = 0$.

2.  If $c$ is not an R-channel:

\[ \chi_{local}(u,c) = \frac{1}{|S(c)|} \sum_{x \in S(c)} \sum_{\substack{d \in C(x) \\ d \neq c\, \wedge\, d \notin c(M_R(u))}} \frac{\ldots}{|S(d)|} \tag{6.3} \]


where $S(b)$ denotes the set of T-interfaces at the local node $u$ that can operate on
channel $b$.

The intuition behind $\chi_{local}(u,c)$ is that it provides a quantification of the conflict faced by
packets bound to channel $c$ from packets bound to channels that compete with $c$ for local
interfaces.

Name                  Description
$T_{LLINFO}$          Used to determine interval between consecutive LLINFOs
$J_{LLINFO}$          Used to determine random jitter between consecutive LLINFOs
$T_{QINFO}$           Used to determine interval between consecutive QINFOs
$J_{QINFO}$           Used to determine random jitter between consecutive QINFOs
$T_{pool}$            Interval between invocations of Channel Pool Management Algorithm
$T_{rassign}$         Base interval between execution of R-channel selection algorithm
                      at an interface (a random jitter gets added to it)
NBR_TTL               Maximum Time-To-Live of a neighbor entry
IFR_TTL               Maximum Time-To-Live of a 2-hop neighbor entry
$K$                   Threshold value used in computing $\chi$ (unit is bits)
$H$                   Threshold value used in computing $\chi$ (unit is seconds)
$\delta_{inertia}$    Minimum difference in channel cost required for new R-channel selection.
                      Used to provide hysteresis in R-channel selection decision; $\delta_{inertia} > 0$
$\ldots_{min}$        Used to provide hysteresis in R-channel selection decision
$\ldots_{comb}$       Used to determine whether R-channel selection
                      should use combinatorial criteria

Table 6.1: Protocol Parameters

Total incoming data score for interface $x \in M_R(u)$ with respect to channel $b$ is defined
as:

\[ Incoming(x,b) = \sum_{v \in nbd(u)} \frac{s(v,u) + q_{nbr}(v,u)}{\max\{[\sigma(v,u) - r(v,u,c(x)) + r(v,u,b)],\ \theta\}} \]

Incoming queue score for an R-interface $x$ at node $u$ is defined as:

\[ \eta(x) = \sum_{v \in nbd(u)} \frac{q_{nbr}(v,u)}{\max\{\sigma(v,u), \theta\}} \]

$\eta(x)$ provides an estimated measure of the amount of traffic queued at neighbors of $u$ that
is expected to be sent to interface $x$.
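
The two incoming scores can be illustrated with a short sketch. It follows the definitions as read from the text, summing over all neighbors; that summation range, the function names, and the tuple layouts are assumptions made for the example.

```python
from typing import Iterable, Tuple

THETA = 1.0  # minimum-rate constant in bits/s


def incoming_queue_score(neighbors: Iterable[Tuple[float, float]]) -> float:
    """eta(x): sum over neighbors v of q_nbr(v, u) / max(sigma(v, u), theta).

    Each tuple is (queued bits at v for u, ratesum of link (v, u) in bits/s).
    """
    return sum(q / max(sigma, THETA) for q, sigma in neighbors)


def incoming_data_score(
    neighbors: Iterable[Tuple[float, float, float, float, float]]
) -> float:
    """Incoming(x, b): load on interface x if it were moved to channel b.

    Each tuple is (recently sent bits, queued bits, ratesum,
    rate on the current channel c(x), rate on the candidate channel b).
    The ratesum is adjusted as if x had already switched from c(x) to b.
    """
    total = 0.0
    for sent, queued, sigma, r_current, r_candidate in neighbors:
        adjusted = sigma - r_current + r_candidate
        total += (sent + queued) / max(adjusted, THETA)
    return total
```

Both scores are expressed in seconds: bits divided by bits per second. A neighbor whose ratesum is zero contributes its raw bit count divided by $\theta = 1$, which makes an unreachable candidate channel look very expensive.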

For clarity, various parameters used by the LL are tabulated in Table 6.1.


Link  Layer  Control  Packets 

The link layer sends/receives the following control packets:

1.  LLINFO: This packet is broadcast by each node $u$. Thus a copy is sent on each
channel $c$ such that some interface of $u$ can transmit on $c$. The LLINFO is sent after
intervals of duration $\max(T_{LLINFO}, 0.15 \cdot m) + X$, where $m$ is the total number of
channels available to the network (on which copies of the LLINFO may possibly need
to be sent), and $X$ is a random variable uniformly distributed in $[0, J_{LLINFO}]$. It may
also be triggered by events that require fresh information propagation (e.g., a change
of an R-interface's channel, or pool membership). The contents of an LLINFO($u$)
packet are as follows:

•  Sequence number

•  Number of active R-interfaces

•  For each active R-interface $x \in M_R(u)$:
$ID(x)$, $type(x)$, $|\mathcal{P}(x)|$, $c(x)$, $\{b \mid b \in \mathcal{P}(x)\}$, $\eta(x)$

•  For each $v \in nbd(u)$:
seqno, $\forall y \in M_R(v)$: $\{ID(y), type(y), |\mathcal{P}(y)|, c(y), \{b \mid b \in \mathcal{P}(y)\}, \eta(y)\}$

Though in our current simulator implementation, we use a globally unique $ID(x)$
for each interface $x$, we remark that one only requires that each node maintain a
locally-unique ID for each of its interfaces, since the pair (nodeIP, ID) then provides
a globally unique identification for each interface.

2.  QINFO: A QINFO($u \to v$) packet is unicast by each node $u$ to some or all neighbors
in situations where the number of channels is greater than 1 and the poolsize is also
greater than 1. The QINFO sending routine is invoked after intervals of duration
$T_{QINFO} + X$, where $X$ is a random variable uniformly distributed in $[0, J_{QINFO}]$.
To reduce overhead, if $|nbd(u)| < 5$, $u$ sends a QINFO to each $v \in nbd(u)$ that is
a symmetric neighbor, else it sends a QINFO to those symmetric neighbors $v$ for
which $q_{nbr}(u,v) + s(u,v) > 5000$ (note that the unit is bits). This packet contains the
following information:

•  Length of outgoing queue to neighbor: $q_{nbr}(u,v)$ and recently sent data $s(u,v)$

•  Number of active R-interfaces at $v$ known to $u$ (this will be $|M_R(v)|$ unless $u$ has
wrong information about $v$)

•  For each R-interface $y \in M_R(v)$:
$|\mathcal{P}(y)|$, $c(y)$, $\forall c \in \mathcal{P}(y)$: $r(u,v,c)$, $\kappa(u,c)$, $\chi(u,v,c)$

3.  CINFO: A CINFO($u \to v$) is sent by $u$ to $v \in nbd(u)$ if $u$ receives a QINFO from
neighbor $v$ containing incorrect information about $u$'s interfaces. The contents of a
CINFO($u$) packet are as follows:

•  Sequence number

•  Number of R-interfaces of $u$

•  For each R-interface $x \in M_R(u)$: $ID(x)$, $type(x)$, $|\mathcal{P}(x)|$, $c(x)$, $\{b \mid b \in \mathcal{P}(x)\}$, $\eta(x)$

4.  PROBE: A probe packet is a broadcast packet which is periodically sent with the
sole purpose of estimating contention on each channel. This packet does not contain
any information.

The sequence numbers for the LLINFO and CINFO packets are drawn from the same 32-bit
sequence number space, and the sequence number is incremented after each packet is sent.
QINFO and PROBE packets have no sequence number.
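
The timing and recipient rules for LLINFO and QINFO can be sketched as below. The function names, parameter names, and the dictionary layout are hypothetical; only the interval formula, the neighbor-count cutoff of 5, and the 5000-bit threshold come from the description above.

```python
import random
from typing import Dict, List, Optional


def llinfo_interval(
    t_llinfo: float,
    j_llinfo: float,
    num_channels: int,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay until the next LLINFO: max(T_LLINFO, 0.15 * m) + X, X ~ U[0, J_LLINFO]."""
    rng = rng or random.Random()
    return max(t_llinfo, 0.15 * num_channels) + rng.uniform(0.0, j_llinfo)


def qinfo_recipients(
    symmetric_load_bits: Dict[str, float],
    neighborhood_size: int,
    threshold_bits: float = 5000.0,
) -> List[str]:
    """Symmetric neighbors that should receive a QINFO in this round.

    `symmetric_load_bits` maps each symmetric neighbor v to q_nbr(u, v) + s(u, v).
    With fewer than 5 neighbors everyone is informed; otherwise only neighbors
    with more than `threshold_bits` of queued or recently sent data.
    """
    if neighborhood_size < 5:
        return sorted(symmetric_load_bits)
    return sorted(v for v, load in symmetric_load_bits.items() if load > threshold_bits)
```

The lower bound that grows with the number of channels keeps the LLINFO rate in check when a copy has to be broadcast on many channels, and the jitter keeps neighboring nodes from sending in lockstep.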

The  link  layer  at  node  u  updates  its  local  information  on  receipt  of  control  packets  in 
the  manner  described  below: 

LLINFO: Whenever an LLINFO is received from $v$, if $v$ is not already in the neighbor-list,
a new entry is created. The LifeTime field of the (new or pre-existing) neighbor entry is set
to NBR_TTL. If an LLINFO is received by $u$ from $v$ on an 802.11a channel, it marks $v$ as
reachable using 802.11a, and sets the aLifeTime field to NBR_TTL. Similarly, if an LLINFO
is received on an 802.11b/g channel, it marks $v$ as reachable using 802.11b, and sets the
bLifeTime field to NBR_TTL. The aLifeTime and bLifeTime fields are refreshed whenever
LLINFO packets are received on the appropriate channels. A periodic timer checks for
expired entries. If an entry expires, the corresponding reachability flag is set to false. In
the absence of any other feedback, this reachability information is used to determine the
achievable rate from $u$ to $v$ on a channel $c$. We remark that this approach is flawed in that
$u$ receiving a packet from $v$ indicates that $u$ is reachable from $v$ and not that $v$ is reachable
from $u$. Thus, this approach basically inverts the reachability information. However, it
provides a low-overhead way to ensure that unless both nodes receive packets from each
other, the link will be marked as asymmetric, and the LL will not accept any new packets
from higher layers to send to this neighbor.^8

If  the  sequence  number  on  this  packet  is  not  smaller  than  or  equal  to  the  last  sequence 
number  received  from  v,  the  interface  and  pool-channel  information  is  overwritten,  and  the 
neighbor  information  is  also  processed,  else  the  packet  is  discarded  after  refreshing  lifetime 
and  reachability  information. 

When an LLINFO is received from some neighbor, containing a record for $v$ as 2-hop
neighbor, if $v$ is not already in the interferer-list, a new entry is created. The lifetime of the
(new or pre-existing) interferer entry is set to MAX_IFR_TTL. If $v$ is also an existing 1-hop
neighbor, and the sequence number on this entry is not smaller than or equal to the last
sequence number associated with $v$'s entry, the interface and pool-channel information is
overwritten. For 2-hop neighbor entries, it is always overwritten (this can be extended to
perform the sequence number check on existing 2-hop neighbors too).

If the received LLINFO leads to a change in important information about the neighbor's
interfaces (i.e., number of R-interfaces, or current channel of an R-interface), a new LLINFO
is sent out to propagate the changed information to other neighbors. The sending of a fresh
QINFO to this neighbor may also be triggered. Moreover, if the LLINFO indicates that
an R-interface of a neighbor $v$ has changed its channel from $c_{old}$ to $c_{new}$, any packets with
next-hop $v$ that are enqueued in $Q_{ch}(u, c_{old})$ are flushed.

QINFO: Whenever a QINFO is received from a neighbor $v$, if $v$ is not in $u$'s neighbor-list,
no action is taken. If $v$ is indeed in the neighbor-list, information in the QINFO overwrites
all information received from previous QINFO packets. Also, depending on whether it
was received over an 802.11a channel or an 802.11b/g channel, the aLifeTime or bLifeTime
field is reset to NBR_TTL, and the corresponding reachability flag is also set. The incoming
queue information stored from a QINFO expires after a certain interval: the LL runs a timer
that periodically checks when the last QINFO was received from a neighbor, and if the time
elapsed since the last QINFO is greater than $3 \cdot T_{QINFO}$, the information about $q_{nbr}(v,u)$
and $s(v,u)$ is reset to 0.

^8 In highly dynamic situations, where the status of a neighbor may fluctuate between symmetric and
asymmetric, this can lead to an incorrect view and resultant loss of performance. It can be improved upon
by including information in the LLINFO packet as to whether packets were received from a neighbor on
802.11a and/or 802.11g in the recent past, and using the information received about oneself from one's
neighbor to assess directional reachability and determine the default achievable rate.
CINFO: If a CINFO is received from $v$ with a fresher sequence number than the last one
received from $v$, the interface and channel pool information in the CINFO overwrites prior
information. A CINFO receipt can also be used to assess a/b-reachability.

6.5.2  Interface  Management 

As has been described earlier, interfaces are classified as being either R-interfaces or
T-interfaces.

Since  all  interfaces  at  a  node  are  assigned  the  same  MAC  address,  but  have  independent 
MAC  procedures,  it  is  important  to  take  care  that  at  any  time  instant,  if  some  R-interface 
of  node  u  is  tuned  to  a  channel  c,  then  no  other  R-interface  or  T-interface  of  u  should  be 
tuned  to  c  at  that  time.  Otherwise,  the  following  undesirable  scenario  may  possibly  occur: 
suppose  neighbor  v  is  sending  data  to  u  on  channel  c.  Since  u  has  two  interfaces  tuned  to 
channel  c,  and  both  have  the  same  MAC  address,  they  will  both  receive  the  packets,  and 
believe  that  they  are  the  intended  recipients.  Thus,  they  will  both  send  ACKs.  As  a  result, 
the  ACKs  may  collide,  in  which  case,  v  would  consider  the  packet  lost,  and  retransmit. 
Repetition  of  the  same  could  lead  to  throughput  degradation.  The  HMCLL  protocol  tries 
to  avoid  the  possibility  of  an  R-interface  and  another  interface  being  tuned  to  the  same 
channel  simultaneously,  except  for  rare  and  brief  transient  periods  that  may  arise  when 
one or more interfaces are switching. While there may potentially be occasional periods
when more than one T-interface is on the same channel, this does not cause the wasteful
transmission  problem  due  to  multiple  ACKs,  as  packets  intended  for  a  node  are  sent  only  on 
the  channel  of  an  R-interface.  If  two  T-interfaces  happen  to  each  be  on  the  same  channel, 
physical  carrier  sense  addresses  the  issue  that  only  one  of  them  should  transmit  at  a  time. 
Thus,  while  such  a  scenario  may  sometimes  lead  to  a  waste  of  interface  time  (if  there  are 
packets  waiting  to  be  sent  on  another  channel  that  the  interface  can  operate  on),  this  does 
not  cause  any  serious  issues. 
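
The same-channel rule can be expressed as a simple invariant check. The sketch below is illustrative only; the function name and the mapping from interface identifier to current channel are assumptions.

```python
from typing import Dict, List


def conflicting_channels(r_channels: Dict[str, int], t_channels: Dict[str, int]) -> List[int]:
    """Channels on which an R-interface shares the channel with another local interface.

    `r_channels` and `t_channels` map interface identifiers to current channels.
    Two T-interfaces on the same channel are tolerated and are not reported.
    """
    conflicts = set()
    r_in_use = set()
    for channel in r_channels.values():
        if channel in r_in_use:
            conflicts.add(channel)  # two R-interfaces on one channel
        r_in_use.add(channel)
    for channel in t_channels.values():
        if channel in r_in_use:
            conflicts.add(channel)  # a T-interface sitting on an R-channel
    return sorted(conflicts)
```

A non-empty result corresponds to the duplicate-ACK scenario described above; an empty result is what the protocol aims for outside brief switching transients.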

Except  for  link  layer  control  packets,  packets  received  on  a  T-interface  are  discarded  by 
the  LL,  to  avoid  the  possibility  of  receiving  duplicate  packets  (primarily  true  for  broadcast 
packets).  However  link  layer  control  packets  are  processed  in  the  same  way  as  packets 
received  on  an  R-interface.  This  helps  provide  resilience  to  loss  of  control  packets  sent 
on  the  R-interface’s  channel.  It  does  not  affect  correctness  as  the  operations  performed 
on receipt of a control packet are idempotent (new information in a packet completely
overwrites previous information). The possibility of a delayed control packet being received
and  causing  stale  information  to  overwrite  newer  information  is  made  negligible  by  using 
sequence  numbers  for  the  control  data  sent  by  any  single  neighbor,  and  ignoring  packets 
with  a  sequence  number  smaller  than  or  equal  to  the  last  known  sequence  number  (where 
smaller  is  defined  as  in  [31]). 
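
The text defers the exact meaning of "smaller" to [31]. The sketch below assumes a wraparound-aware comparison over the 32-bit sequence space, in the style of serial number arithmetic; this is an assumption for illustration and may differ in detail from the definition in [31].

```python
SEQ_BITS = 32
SEQ_MOD = 1 << SEQ_BITS          # size of the sequence number space
SEQ_HALF = 1 << (SEQ_BITS - 1)   # half the space, used as the ordering window


def seq_newer(candidate: int, last_known: int) -> bool:
    """True if `candidate` is strictly newer than `last_known`, allowing wraparound."""
    diff = (candidate - last_known) % SEQ_MOD
    return 0 < diff < SEQ_HALF


def should_accept(candidate: int, last_known: int) -> bool:
    """A control packet is processed only if it is not smaller than or equal to the last one."""
    return seq_newer(candidate, last_known)
```

With this comparison, a packet numbered 1 that arrives after one numbered $2^{32} - 1$ is still treated as newer, so the counter can wrap without stale information being accepted.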

R-Interface  Management 

Following  the  channel  restriction  approach  we  described  in  Section  6.2,  we  associate  with 
each  interface  a  pool  of  channels,  from  which  the  current  channel  is  dynamically  selected. 
Thus,  the  R-interface  management  has  two  aspects,  viz.,  channel  pool  management,  and 
R-channel  selection.  We  now  describe  each  of  these. 

Channel Pool Management  Recall that $C(x)$ denotes the set of channels on which
interface $x$ is capable of operating, each R-interface $x$ has an associated channel-pool
$\mathcal{P}(x) \subseteq C(x)$ such that $|\mathcal{P}(x)| = f$, and the current channel of an interface $x$ is denoted by $c(x)$.
Note that one could potentially allow different pool sizes for different interfaces, but for
simplicity, this is currently a global constant for all interfaces of a particular type.

In  keeping  with  the  objective  of  a  priori  load-balance,  it  is  desirable  that  the  channels 
be  equitably  distributed  across  pools,  such  that  in  any  vicinity  all  channels  occur  in  roughly 
the  same  number  of  pools. 

We  use  a  probabilistic  mechanism  for  pool  management. 

At the time of starting up, each interface is assigned a set of $f$ channels chosen uniformly
at random from all such possible $f$-subsets. Progressively, as LLINFO packets are received
from neighboring nodes, the Neighbor Table gets populated with information about the
channel-pools of the R-interfaces of these nodes. The Channel Pool Manager uses a timer
that is scheduled at start-up after an interval uniformly distributed between 0 and $T_{pool}$
seconds, and thereafter rescheduled every $T_{pool}$ seconds. The initial random interval serves
to desynchronize the pool-selection decisions of different nodes. Whenever the timer expires,
the procedure described in Algorithm 1 is executed.

In the current design, the periodic channel pool management algorithms of all
R-interfaces at node $u$ use the same timer (i.e., they are all executed sequentially whenever
the timer expires). However, this behavior can be altered if necessary.

Algorithm 1  Channel Pool Management Algorithm (Interface $x$)

  $I(x) \leftarrow$ the set of all R-interfaces within 2 hops of interface $x$
  for all $c \in C(x)$ do
    $n(c) \leftarrow |\{y \mid y \in I(x) \cup \{x\},\ c \in \mathcal{P}(y)\}|$
  end for
  $\bar{n} \leftarrow \frac{1}{|C(x)|} \sum_{c \in C(x)} n(c)$
  $c_{min} \leftarrow \arg\min_{c \in C(x) \setminus \mathcal{P}(x),\ n(c) < \bar{n}} n(c)$
  if $c_{min}$ is not unique, choose one of the candidates uniformly at random as $c_{min}$
  $m \leftarrow |\{y \mid y \in I(x) \cup \{x\},\ c_{min} \in C(y)\}|$
  $c_{max} \leftarrow \arg\max_{c \in \mathcal{P}(x)} n(c)$
  changeflag $\leftarrow 0$
  if $n(c_{max}) > \bar{n}$ and $n(c_{max}) > n(c_{min}) + 1$ then
    $p \leftarrow \frac{1}{m}$
    if $c_{max} = c(x)$ then
      $p \leftarrow \frac{p}{2}$
    end if
    $R \leftarrow$ random number uniformly distributed between 0 and 1
    if $R < p$ then
      $\mathcal{P}(x) \leftarrow (\mathcal{P}(x) \setminus \{c_{max}\}) \cup \{c_{min}\}$
      changeflag $\leftarrow 1$
    end if
  end if
  if changeflag then
    cancel $x$'s running R-channel assignment timer and reschedule to invoke an R-channel
    selection
  end if
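
One run of the pool-management step can be sketched in Python as follows. This is an illustration under stated assumptions, not the simulator implementation: it takes the threshold to be the mean pool-occupancy over the channels in $C(x)$, the base damping probability to be $1/m$, and halves that probability when the evicted channel is the current channel. The function name and data layout are hypothetical.

```python
import random
from typing import Dict, Optional, Set


def pool_update(
    x: str,
    capable: Dict[str, Set[int]],
    pools: Dict[str, Set[int]],
    current: int,
    rng: Optional[random.Random] = None,
) -> bool:
    """Run one pool-management step for R-interface x; return True if the pool changed.

    capable[y] is C(y) and pools[y] is the channel pool of y, for x itself and
    for every R-interface within two hops of x. pools[x] is updated in place.
    """
    rng = rng or random.Random()
    channels = capable[x]
    # n(c): number of nearby pools (including x's own) that contain channel c.
    n = {c: sum(1 for y in pools if c in pools[y]) for c in channels}
    n_bar = sum(n.values()) / len(channels)  # assumption: mean occupancy
    # Candidate channels to add: outside x's pool and under-represented.
    candidates = [c for c in channels - pools[x] if n[c] < n_bar]
    if not candidates:
        return False
    lowest = min(n[c] for c in candidates)
    c_min = rng.choice(sorted(c for c in candidates if n[c] == lowest))
    # m: number of nearby interfaces that could also adopt c_min.
    m = sum(1 for y in capable if c_min in capable[y])
    c_max = max(sorted(pools[x]), key=lambda c: n[c])
    if n[c_max] > n_bar and n[c_max] > n[c_min] + 1:
        p = 1.0 / m  # assumption: base damping probability
        if c_max == current:
            p /= 2.0  # assumption: be more reluctant to evict the current channel
        if rng.random() < p:
            pools[x] = (pools[x] - {c_max}) | {c_min}
            return True
    return False
```

A caller would invoke this once per R-interface when the pool timer fires and, on a `True` result, reschedule that interface's R-channel selection. Passing a seeded `random.Random` makes runs reproducible.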

We remark that our algorithm for pool-management bears similarity to the algorithm
for minimum conflict coloring in [32], and the algorithm for channel assignment in Net-X
[64].  Also  related  is  the  probabilistic  distributed  learning  algorithm  for  channel  assignment 
described  in  [72]. 

Ideally,  we  would  like  the  pool  membership  to  stabilize  after  a  brief  period  of  churn,  with 
further  changes  occurring  rarely.  However,  due  to  the  distributed  and  probabilistic  nature  of 
the  algorithm,  the  channel  pool  membership  can  exhibit  quasi-stable  behavior,  i.e.,  after  a 
brief  initial  period  of  pool-adjustment,  the  pool  membership  may  either  fully  stabilize,  or  it 
may  largely  stabilize  with  occasional  pool  membership  changes  still  happening  at  relatively 
low  rate. 

It  is  to  be  noted  that  it  is  important  to  introduce  some  probabilistic  damping  in  the 
pool-management  procedure  to  achieve  good  stability  properties.  One  can  conceive  of 
many  possible  formulations  for  the  damping  probability,  which  can  aim  at  reducing  the 
possibility  of  many  interfaces  including  or  evicting  the  same  channel  at  around  the  same 
time. What we use in the current design (see Algorithm 1) is one such formulation, which
intuitively tries to reduce the possibility of the same channel being included in the pools of
many nearby interfaces at around the same time. Other possibilities include the damping
probability formulation used in [64] for channel-assignment, which intuitively tries to reduce
the possibility of nearby interfaces on the same channel switching to different channels at
around the same time (and can be suitably modified and applied to channel pools). Since
the pools are initially chosen uniformly at random, the decisions only involve a two-hop
view, and they occur in a staggered manner (due to the initial desynchronization), the
protocol performance with many such variant formulations is expected to be similar, since
the pool membership would typically adjust and become stable or quasi-stable after a
brief post-startup period of churn.

R-Channel  Selection  The  R-channel  selection  algorithm  is  designed  on  the  premise  that 
all  selection  decisions  are  sequential  and  staggered  at  different  nodes. 

To reduce the chance of inadvertent synchronization, the protocol incorporates an
element of random jitter in the assignment-interval. Thus, each interface has an R-channel
re-assignment timer that is rescheduled over duration $T_{rassign} + X$, where $X$ is a random
variable uniformly distributed over $[0, J_{rassign}]$.
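The jittered rescheduling just described can be sketched in a few lines of Python. This is only an illustration, not the simulator code: the function name `next_reassign_delay` and the driver loop are hypothetical, and the numeric values are the $T_{rassign}$ and $J_{rassign}$ settings listed in Table 6.3.

```python
import random


def next_reassign_delay(t_rassign, j_rassign, rng):
    """Return T_rassign + X, where X is uniform over [0, J_rassign]."""
    return t_rassign + rng.uniform(0.0, j_rassign)


# Hypothetical driver: draw a few consecutive timer durations (in seconds).
rng = random.Random(7)
delays = [next_reassign_delay(0.75, 1.0, rng) for _ in range(5)]
print(delays)
```

Because each interface draws its own jitter, two interfaces that happen to start together are unlikely to keep firing at the same instants.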

For simplicity, we currently use globally constant values for $T_{rassign}$ and $J_{rassign}$.
The channel cost metric for channel $b$ computed for interface $x$ of node $u$ has four
components:

1. Explicitly known interference conflict cost:

$$C_{einc}(x, b) = \sum_{v} \sum_{\substack{y \in M_R(v) \\ c(y) = b}} \eta(y) \qquad (6.4)$$

2. Interface conflict cost:

$$C_{ifc}(x, b) = \frac{1}{T_{rassign}} \sum_{\substack{v \in nbd(u) \\ q_{nbr}(v,u) > 0}} \left( \chi(v,u,b) - D(v,u,b,x) \right) \qquad (6.5)$$


where $D(v,u,b,x) = \dfrac{q_{nbr}(v,u)}{\max\{\sigma(v,u) - r(v,u,c(x)) + r(v,u,b),\ 0\}}$ if $c(x) = b$ or if
$[(c(x) \notin c(M_R(v))) \wedge (b \notin c(M_R(u)))]$, and is 0 else.


The intuition behind subtracting $D(v,u,b,x)$ from $\chi(v,u,b)$ is that the latter may
sometimes include traffic intended for interface $x$. This should not be counted as a
cost as is, since even after a channel switch, one might typically expect the same amount
of traffic (in bits) to be re-directed to whatever new channel $x$ may switch to (although
the rate difference between the channels should be considered). We also remark that the
specific definition of $D(v,u,b,x)$ is driven by the fact that any R-interface $x$ is
single-mode, and thus all channels in $C(x)$ can be operated on by exactly the same set of
T-interfaces at a neighbor $v$.


3. Contention cost (this component helps capture interference beyond the two-hop
neighborhood, which is not captured by the explicit interference cost, and also captures
interference conflicts not reflected in queue-lengths):


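Let $w_v = q_{nbr}(v, u) + s(v, u)$.

$$C_{iinc}(x, b) = \begin{cases} \dfrac{1}{T_{rassign}} \cdot \dfrac{\sum_{v \in nbd(u)} w_v K(v,b)}{\sum_{v \in nbd(u)} w_v} & \text{if } \sum_{v \in nbd(u)} w_v > 0 \\ 0 & \text{else} \end{cases} \qquad (6.6)$$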
4. Expected cost of traffic incoming to itself:

$$C_{self}(x, b) = \frac{Incoming(x, b)}{T_{rassign}} \qquad (6.7)$$

The cost of a channel $b$, as computed by R-interface $x$ of node $u$, is given by:

$$Cost(x, b) = C_{einc}(x, b) + C_{ifc}(x, b) + C_{iinc}(x, b) + C_{self}(x, b) \qquad (6.8)$$

The  R-channel  is  selected  using  the  procedure  in  Algorithm  2,  which  returns  the  chosen 
channel.  If  the  chosen  channel  is  different  from  the  current  channel,  a  switch  is  initiated. 
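The final, cost-based part of the R-channel selection (the threshold on the external cost and the inertia rule of Algorithm 2) can be illustrated with the following Python sketch. It is a simplified reading under stated assumptions, not the thesis implementation: the names `ChannelCost` and `select_r_channel` are hypothetical, the four cost components are taken as given numbers, ties are broken by dictionary order, and both the staleness check and the combinatorial branch of Algorithm 2 are omitted. The threshold values in the example call are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChannelCost:
    einc: float       # explicitly known interference conflict cost
    ifc: float        # interface conflict cost
    iinc: float       # contention cost
    self_cost: float  # expected cost of traffic incoming to the interface

    @property
    def external(self):
        return self.einc + self.ifc + self.iinc

    @property
    def total(self):
        return self.external + self.self_cost


def select_r_channel(current, costs, incoming_current, delta_min, delta_inertia):
    """Pick an R-channel from `costs` (candidate channel -> ChannelCost)."""
    best = min(costs, key=lambda ch: costs[ch].total)
    if current not in costs:
        return best  # the current channel is not among the candidates
    if costs[best].external > 1.0:
        return current  # external cost of the best candidate exceeds 1.0: stay
    current_total = costs[current].total
    if (incoming_current < delta_min
            or current_total == 0
            or costs[best].total >= current_total - delta_inertia):
        return current  # inertia: the improvement is too small to switch
    return best


costs = {1: ChannelCost(0.4, 0.2, 0.1, 0.1), 6: ChannelCost(0.1, 0.0, 0.05, 0.1)}
print(select_r_channel(1, costs, incoming_current=0.5, delta_min=0.01, delta_inertia=0.1))
```

In this example the interface moves from channel 1 to channel 6; raising `delta_inertia` enough makes it stay on channel 1, which is the damping effect the inertia term is meant to provide.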

6.5.3  Packet  Scheduling:  Channel  and  Interface  Binding 

The channel and interface selection decisions are decomposed into two separate decisions,
viz., channel selection and interface selection, which are coupled through the channel queue
occupancies and the local interface conflict score $\chi_{local}$ (which is a function of the
channel queue occupancies, and the number/type of interfaces available at the node).

The channel binding decision is performed by a channel scheduler (denoted by
CH-scheduler), and the interface binding decision is performed by an interface scheduler
(denoted by IF-scheduler).

The  structure  of  the  packet  scheduling  component  is  depicted  in  Fig.  6.4. 

The  link-layer  at  each  node  u  maintains  the  following  system  of  queues: 

1. Neighbor Queues: Each outgoing unicast packet has a next-hop $v \in nbd(u)$, and
is enqueued in the queue corresponding to the appropriate neighbor $v$. The queue at
node $u$ for neighbor $v$ is denoted by $Q_{nbr}(u,v)$, while the length of this queue in bits
is denoted by $q_{nbr}(u,v)$, and the length in packets is denoted by $q^{p}_{nbr}(u,v)$.

2.  Channel  Queues:  There  is  a  pair  of  queues  for  each  channel  c  such  that  some 
interface  of  u  can  tune  to  c.  These  contain  packets  that  have  already  been  bound  to 
channel  c  (i.e.,  these  packets  will  be  sent  on  channel  c).  The  first  of  these  is  meant  to 
temporarily  hold  high-priority  packets  (LL  control  packets,  ARP  packets  and  routing 
packets).  We  shall  refer  to  this  as  the  high  priority  holding  buffer  for  the  channel.  All 
other  packets  are  enqueued  in  the  second  queue.  We  shall  refer  to  this  as  the  channel 
Algorithm 2 R-Channel Selection Algorithm (Interface x at Node u)

$S \leftarrow \mathcal{P}(x)$
if last packet on $c(x)$ received more than $M$ seconds ago then
    $S \leftarrow S \setminus \{c(x)\}$
end if
for all $b \in S$ do
    if $b \in c(M_R(u) \setminus \{x\})$ then
        $S \leftarrow S \setminus \{b\}$
    end if
end for
if $S = \emptyset$ then
    evict first channel in pool; replace with any channel $d$ that is not current channel of another R-interface
    return $d$
    if no such channel found, deactivate interface $x$
end if
for all $b \in S$ do
    compute $Cost(b)$
end for
$b \leftarrow \arg\min_{c \in S} Cost(c)$
if $c(x) \in S$ then
    if $Incoming(x, c(x)) < \delta_{comb}$ and $p(x) < \delta_{comb}$ and $p(x) < \delta_{comb}$ then
        {try to do a combinatorial channel selection instead of a cost-based one}
        $B \leftarrow \mathcal{P}(x)$
        $I(x) \leftarrow$ the set of all R-interfaces of nodes in $nbd_2(u)$
        for all $d \in S$ do
            $n(d) \leftarrow |\{y \mid y \in I(x),\ c(y) = d\}|$
        end for
        for all $d \in B$ do
            if $(d \notin c(M_R(u))) \wedge (n(d) < n(c(x)))$ then

P  ^  n(c(x)) 

                $R \leftarrow$ random number uniformly distributed between 0 and 1
                if $R < p$ then
                    return $d$
                end if
            end if
        end for
    end if
    if $(C_{einc}(x, b) + C_{ifc}(x, b) + C_{iinc}(x, b)) > 1.0$ then
        return $c(x)$
    end if
    if $Incoming(x, c(x)) < \delta_{min}$ or $Cost(c(x)) = 0$ or $Cost(b) \geq (Cost(c(x)) - \delta_{inertia})$ then
        return $c(x)$
    end if
end if
return $b$
Figure 6.4: Structure of Scheduling Module

queue, and denote this queue for channel $c$ at node $u$ by $Q_{ch}(u, c)$, with the length in
bits denoted by $q_{ch}(u, c)$. The length in packets is denoted by $q^{p}_{ch}(u, c)$.

3. Interface Queues: There is a queue for each interface $x$, containing packets that have
already been bound to the interface $x$, and are awaiting their turn for transmission by
interface $x$. The queue for an interface $x$ is denoted by $Q_{if}(x)$ and the queue-length
is denoted by $q_{if}(x)$.

Handling  Multi-Channel  Broadcast 

Currently,  we  adopt  a  very  simple  approach  to  broadcast.  The  node  v  sends  a  copy  of  each 
broadcast  packet  on  all  channels  that  can  be  operated  on  by  at  least  one  of  its  interfaces. 

High  Priority  Packets 

Broadcast packets have higher priority than unicast packets since typically most of these
are expected to be link layer or network layer control packets. Whenever the link layer
receives a broadcast packet for sending, it creates a copy of this packet for each channel
and enqueues it in the high-priority holding buffer of that channel.

High-priority unicast packets are handled as follows: if the next-hop node (MAC
destination) for the packet is $v$, the packet is enqueued in the high priority holding buffer
of the channel with highest effective rate that can be used to reach that neighbor (i.e.,
$c(z)$, where $z = \arg\max_{y \in \mathcal{T}(u,v)} r(u, v, c(y))$).
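As a small illustration of this rule, the following Python sketch picks the channel of the valid interface with the highest effective rate. The function name `high_priority_channel`, the dictionary layout, and the example channel numbers and rates are all hypothetical.

```python
def high_priority_channel(valid_interfaces, channel_of, rate_on):
    """Return c(z), where z maximizes the effective rate r(u, v, c(y)) over valid interfaces y."""
    z = max(valid_interfaces, key=lambda y: rate_on[channel_of[y]])
    return channel_of[z]


# Hypothetical example: two interfaces at neighbor v that u could transmit to.
channel_of = {"y0": 36, "y1": 1}  # interface -> its current channel
rate_on = {36: 6.0, 1: 2.0}       # channel -> effective rate from u to v (Mbps)
print(high_priority_channel(["y0", "y1"], channel_of, rate_on))
```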

Link layer control packets also have high priority (note that while LLINFO is broadcast,
QINFO and CINFO are unicast). Whenever the link layer generates a control packet to
send, it does the following: LLINFO is processed in the same way as other broadcast packets,
while QINFO/CINFO from $u$ to $v$ are processed like any other high priority unicast packet.

When  a  routing  protocol  is  in  use,  it  is  desirable  that  any  unicast  routing  control  packets 
should  also  be  given  priority. 

The  CH-scheduler  determines  how  packets  will  be  transferred  from  the  Neighbor  Queues 
to  the  Channel  Queues,  while  the  IF-scheduler  determines  how  packets  will  be  transferred 
from  the  Channel  Queues  to  the  Interface  Queues. 

Channel  Binding 

The  CH-scheduler’s  state  at  any  instant  is  either  blocked  or  unblocked. 

1.  Initially,  the  state  is  unblocked. 

2.  Whenever  the  link  layer  receives  a  new  packet  of  regular  priority  to  send  from  upper 
layers  then,  after  enqueuing  the  packet  in  the  appropriate  neighbor-queue,  it  invokes 
the  CH-scheduler. 

3. If an invocation of the IF-scheduler results in a non-empty channel-queue becoming
empty, the CH-scheduler is invoked after ensuring that its state is unblocked (i.e., if
the state is blocked, it is set to unblocked). This is also described later in Section 6.5.3.

4.  Whenever  the  CH-scheduler  is  invoked: 

(a)  If  the  state  is  blocked,  nothing  is  done. 

(b) If the state is unblocked, the channel-binding routine (Algorithm 3) is executed.
After the execution of the channel-binding routine:

i. If any channel-queue is still empty, the state remains unblocked, else it is set
to blocked.

ii.  The  IF  scheduler  is  invoked. 
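The blocked/unblocked rules listed above amount to a small state machine. A minimal Python sketch is given below; the class name `ChScheduler` and the three callbacks are hypothetical stand-ins for the channel-binding routine, the channel-queue occupancy check and the IF-scheduler, none of which is modeled here.

```python
class ChScheduler:
    """Blocked/unblocked wrapper around a channel-binding routine."""

    def __init__(self, bind_channels, any_channel_queue_empty, invoke_if_scheduler):
        self.blocked = False  # rule 1: initially unblocked
        self._bind_channels = bind_channels
        self._any_channel_queue_empty = any_channel_queue_empty
        self._invoke_if_scheduler = invoke_if_scheduler

    def invoke(self):
        """Rule 4: run the binding routine unless blocked."""
        if self.blocked:
            return False
        self._bind_channels()
        # Remain unblocked only while some channel-queue is still empty.
        self.blocked = not self._any_channel_queue_empty()
        self._invoke_if_scheduler()
        return True

    def on_packet_enqueued(self):
        """Rule 2: a regular-priority packet was placed in a neighbor-queue."""
        return self.invoke()

    def on_channel_queue_drained(self):
        """Rule 3: the IF-scheduler emptied a channel-queue."""
        self.blocked = False
        return self.invoke()


log = []
sched = ChScheduler(lambda: log.append("bind"), lambda: False, lambda: log.append("if"))
sched.on_packet_enqueued()        # runs binding, then blocks (no empty channel-queue)
sched.on_packet_enqueued()        # ignored while blocked
sched.on_channel_queue_drained()  # unblocks and runs binding again
print(log, sched.blocked)
```

Blocking the scheduler while every channel-queue already holds packets is what keeps packets in the neighbor-queues until a channel actually needs more work, in line with the late-binding objective.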

We  now  explain  the  intuition  behind  the  channel  binding  routine. 

A channel-queue is said to be eligible for scheduling in an invocation of the CH-scheduler
if the occupancy of that queue at the time the scheduler was invoked is below a certain
threshold CQ_THRESH = PKT_QUANTUM. Deeming queues with more than CQ_THRESH
packets ineligible helps facilitate the objective of late-binding. Queues can also be deemed
ineligible if their local conflict score is more than CQ_THRESH.

Consider  the  set  of  all  eligible  neighbor-queues  at  node  u.  Each  has  a  certain  next-hop 
node  (MAC  destination)  v  for  which  there  is  a  set  of  valid  interfaces  T{u,v)  C  and 

correspondingly a set of possible channels $\mathcal{T}_c(u,v) = \{c(y) \mid y \in \mathcal{T}(u,v)\}$.

Since the channel-assignment has already attempted to factor in the traffic-awareness,
it  is  now  reasonable  to  treat  the  link-layer  packet  scheduling  problem  as  an  independent 
local  decision.  From  the  perspective  of  the  link-layer  at  node  u,  each  packet  enqueued  in 
the  set  of  neighbor-queues  has  a  next-hop  node  from  amongst  u’s  neighbors  to  which  it  has 
to  send  the  packet.  Thus,  the  link-layer  treats  the  local  packet  scheduling  problem  as  if  it 
were  a  problem  involving  single-hop  flows. 

We draw intuition from the Dynamic Backpressure Scheduler of Tassiulas and Ephremides
[110]. In a scenario where all flows traverse only a single hop, a scheduler which activates
links in a manner that maximizes $\sum q_i r_i$ is throughput-optimal (assuming the traffic load
falls within the network's stability region). In our scheduling scenario, we can treat each
valid (neighbor, channel) pair as a link, and define a conflict between two pairs if they
have the same channel. Trying to map the algorithm of [110] directly, one might consider
trying to assign packets from various eligible queues to channels, such that the assignment
maximizes $\sum q_p R_p$, where $q_p$ is the length of the neighbor-queue from which the packet $p$ is
taken, and $R_p$ is the net datarate of the link-channel pair over which $p$ is scheduled.

However,  in  practice,  this  can  lead  to  long  delays  and  possible  starvation  for  some  flows 
(especially  if  some  flows  are  aggressive  and  inelastic).  Additionally,  from  considerations 
of  amortization,  it  may  be  desirable  to  transfer  packets  from  the  neighbor-queues  to  the 
channel-queues in certain quanta. An alternative approach might consist of selecting a set
$Q$ of (neighbor, channel) pairs that maximize $\sum Age(v)\,\mu(u,v,c)$, where $Age(v)$ is the age
of  the  HOL  (and  hence  oldest,  as  the  neighbor  queues  are  FIFO)  packet  of  the  queue  for 
neighbor  v.  This  gives  priority  to  packets  that  have  been  waiting  longer,  and  thus  improves 
fairness  characteristics.  At  the  same  time,  it  does  not  completely  deviate  from  the  intuition 
behind  the  throughput-optimal  dynamic  backpressure  scheduler  described  in  [110],  since  a 
FIFO  queue  that  has  been  consistently  large  in  the  recent  past  is  also  likely  to  have  an 
HOL  packet  of  large  age.  We  adopt  a  similar  approach. 

The channel-binding procedure is described in Algorithm 3. Note that comparison
between ordered pairs $z_1 = (w_1, r_1)$ and $z_2 = (w_2, r_2)$ is defined as $z_1 > z_2$ if either
$w_1 > w_2$, or $w_1 = w_2$ and $r_1 > r_2$; $z_1 = z_2$ if $z_1 \not> z_2$ and $z_2 \not> z_1$.
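The ordered-pair comparison defined above is a lexicographic order, which is exactly how Python compares tuples. The sketch below uses this to illustrate the greedy selection loop of the channel-binding procedure. It is a simplified reading and not the thesis code: the function name `bind_channels` and its dictionary inputs are hypothetical, the eligible (neighbor, channel) pairs are assumed to be already filtered, the re-check of the local conflict score inside the loop is omitted, and a pair whose neighbor-queue has run empty is simply dropped (an assumption made here so that the loop terminates).

```python
def bind_channels(pairs, hol_age, mu, r_prime, backlog, quantum):
    """Greedy age-weighted binding over eligible (neighbor, channel) pairs.

    Returns a list of (neighbor, channel, packets_transferred) tuples.
    """
    backlog = dict(backlog)  # work on a copy of the per-neighbor packet counts
    weight = {
        (v, c): (hol_age[v] * mu[(v, c)] if backlog[v] > 0 else 0.0, r_prime[(v, c)])
        for (v, c) in pairs
    }
    remaining = set(pairs)
    transfers = []
    while remaining:
        # Tuples compare lexicographically: first by w, then by r'.
        z, d = max(remaining, key=lambda p: weight[p])
        if backlog[z] == 0:
            remaining.discard((z, d))  # assumption: drop pairs with nothing left to send
            continue
        n = min(backlog[z], quantum)
        backlog[z] -= n
        transfers.append((z, d, n))
        # Channel d has received its quantum; remove every pair that uses it.
        remaining = {(w, b) for (w, b) in remaining if b != d}
    return transfers


pairs = [("v1", 1), ("v1", 2), ("v2", 1)]
hol_age = {"v1": 0.2, "v2": 0.5}                             # seconds
mu = {("v1", 1): 6.0, ("v1", 2): 6.0, ("v2", 1): 6.0}
r_prime = {("v1", 1): 6.0, ("v1", 2): 2.0, ("v2", 1): 6.0}
backlog = {"v1": 3, "v2": 10}                                # packets
print(bind_channels(pairs, hol_age, mu, r_prime, backlog, quantum=4))
```

In the example, neighbor v2 has the oldest head-of-line packet and therefore claims channel 1 first; v1 is then bound to channel 2, the only channel left for it.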

Interface  Binding 

The  interface  binding  (IF)  scheduler’s  state  at  any  instant  is  either  blocked  or  unblocked. 

1.  Initially,  the  state  is  unblocked. 

2.  Whenever  the  link  layer  receives  a  new  broadcast  packet,  or  a  high  priority  unicast 
packet  to  send  (either  a  LL  control  packet,  or  from  upper  layers)  then,  after  enqueuing 
the  packet  in  the  appropriate  channel-queue  (as  described  in  Section  6.5.3),  it  invokes 
the  IF-scheduler. 

3.  Whenever  an  interface-queue  becomes  empty,  a  link-layer  callback  is  invoked,  which 
sets  the  state  of  the  IF-scheduler  to  unblocked,  and  invokes  it. 

4.  The  IF-scheduler  is  also  invoked  after  any  invocation  of  the  CH-scheduler  (as  described 
in  Section  6.5.3). 

5. Whenever the IF-scheduler is invoked:

(a)  If  the  state  is  blocked,  nothing  is  done. 

(b)  If  state  is  unblocked,  the  interface-binding  routine  (Algorithm  4)  is  executed. 
After  the  execution  of  the  interface-binding  algorithm: 

i.  If  there  is  no  available  interface  y  such  that  qif{y)  =  0,  the  IF-scheduler’s 
state  is  set  to  blocked. 
Algorithm 3 Channel Binding Algorithm (Node u)

CQ_THRESH $\leftarrow$ PKT_QUANTUM
for all $v \in nbd(u)$ do
    $\mathcal{T}_c(u,v) \leftarrow \bigcup_{y \in \mathcal{T}(u,v)} c(y)$
end for
$S \leftarrow \bigcup_{v \in nbd(u)} (\{v\} \times \mathcal{T}_c(u,v))$
for all $(v,c) \in S$ do
    if $q^{p}_{ch}(u,c) >$ CQ_THRESH or $\chi_{local}(u,c) >$ CQ_THRESH or $\mu(u,v,c) = 0$ then
        $S \leftarrow S \setminus \{(v,c)\}$
    end if
end for
for all $(v,c) \in S$ do
    if $Q_{nbr}(u,v) = 0$ then
        $w(v,c) = 0$
    else
        $Age(v) \leftarrow$ time in queue spent by HOL packet of $Q_{nbr}(u,v)$
        $w(v,c) \leftarrow Age(v)\,\mu(u,v,c)$
r'{v,  c)  <—  V,  c) 

    end if
end for
while $S \neq \emptyset$ do
    $(z,d) \leftarrow \arg\max_{S} (w(v,c), r'(v,c))$
    if $Q_{nbr}(u,z) = 0$ then
        continue
    end if
    Transfer $\min\{q^{p}_{nbr}(u,z),$ PKT_QUANTUM$\}$ packets from $Q_{nbr}(u,z)$ to $Q_{ch}(u,d)$
    for all $(w,b) \in S$ such that $b = d$ do
        $S \leftarrow S \setminus \{(w,b)\}$
    end for
    for all $(w,b) \in S$ do
        if $\chi_{local}(w,b) >$ CQ_THRESH then
            $S \leftarrow S \setminus \{(w,b)\}$
        end if
    end for
end while
ii. If some initially non-empty channel queue became empty as a result of
packet-transfer during interface binding:

•  If  the  CH-scheduler’s  state  is  blocked,  it  is  changed  to  unblocked. 

•  The  CH-scheduler  is  invoked. 

Note  that  an  interface  is  deemed  to  be  available  for  scheduling  by  the  IF-scheduler  if 
it  is  neither  off  nor  in  the  process  of  switching.  Also  note  that  the  interface-queue  lengths 
may  change  in  the  course  of  execution  of  the  procedure,  as  packets  get  transferred. 

Interface  Queues 

Once  a  packet  has  been  transferred  to  an  interface  queue,  the  link  layer  relinquishes  control 
over  it  (except  for  possibly  triggering  a  flushing  of  packets  from  the  interface-queue  in  case 
of  a  channel-switch).  Whenever  an  interface-queue  becomes  empty,  a  link-layer  callback  is 
invoked,  which  sets  the  state  of  the  IF-scheduler  to  unblocked,  and  invokes  it. 

6.6  Evaluation 

The ns-2 simulator (version 2.31) [46] has been used as the codebase, with substantial
modifications to the physical layer and node models. An SINR threshold based model is
used, whereby a packet is received successfully if it is received at a power-level equal to or
greater than the receiver sensitivity, and the SINR is equal to or greater than the SINR
threshold. While this leads to a 0/1 model of packet reception, and does not capture the
relationship between SINR and BER, it provides a reasonable approximation for evaluation
of a link layer channel and interface management scheme. Cumulative interference has been
modeled, and the total received power at an interface used in SINR determination is the
sum of the received powers from all packets on the air in that channel at that instant, as
well as a small thermal noise component (which is constant for any given channel).
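The 0/1 reception rule described above can be written down compactly. The following Python sketch is an illustration only and is not taken from the modified ns-2 code: the function names are hypothetical, powers are assumed to be given in dBm, and the thermal noise level of -100 dBm used in the example is an arbitrary placeholder. The sensitivity and SINR threshold in the example are the 6 Mbps values from Table 6.2.

```python
import math


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10.0)


def received_ok(signal_dbm, interferers_dbm, noise_dbm, sensitivity_dbm, sinr_threshold_db):
    """0/1 reception: sensitivity check plus SINR check with cumulative interference."""
    if signal_dbm < sensitivity_dbm:
        return False
    # Interference is the sum of all other packets on the air, plus thermal noise.
    denom_mw = dbm_to_mw(noise_dbm) + sum(dbm_to_mw(p) for p in interferers_dbm)
    sinr_db = 10.0 * math.log10(dbm_to_mw(signal_dbm) / denom_mw)
    return sinr_db >= sinr_threshold_db


# 6 Mbps row of Table 6.2: sensitivity -87 dBm, SINR threshold 6.02 dB.
print(received_ok(-80.0, [], -100.0, -87.0, 6.02))       # no interferers
print(received_ok(-80.0, [-82.0], -100.0, -87.0, 6.02))  # one strong interferer
```

Note that the sums are taken in linear units (mW), not in dB, which is what makes the interference cumulative.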

Various rate-specific parameter values used in the evaluation are listed in Table 6.2.
The  RX-sensitivity  values  are  obtained  from  the  specifications  of  the  Cisco  Aironet  NIC, 
while  the  SINR  threshold  values  are  from  [126].  A  fixed  transmission  power  of  65  mW  is 
used.  A  data  payload  size  of  1450  bytes  is  used  for  all  data  packets  sent.  No  MAC-layer 
fragmentation  is  performed.  The  carrier-sense  threshold  is  set  to  -108  dBm  (the  physical 
Algorithm 4 Interface Binding Algorithm (Node u)

{First we handle high priority packets}
for all $x \in M_R(u)$ do
    if $q_{if}(x) = 0$ and $x$ is available then
        transfer packets from high priority holding buffer of $c(x)$ to $Q_{if}(x)$ till either former is empty, or latter is full
    end if
end for
$C \leftarrow$ set of all available channels
for all $c \in C \setminus c(M_R(u))$ do
    for all $x \in M_T(u)$ do
        if $x$ can operate on $c$ and $q_{if}(x) = 0$ and $x$ is available then
            transfer packets from high priority holding buffer of $c$ to $Q_{if}(x)$ till either former is empty, or latter is full
        end if
    end for
end for
{Next we handle regular priority packets}
for all $x \in M_R(u)$ do
    if $q_{if}(x) = 0$ and $x$ is available then
        if $q_{ch}(u, c(x)) > 0$ then
            Transfer $\min\{q^{p}_{ch}(u, c(x)),$ PKT_QUANTUM$\}$ packets from $Q_{ch}(u, c(x))$ to $Q_{if}(x)$
        end if
    end if
end for
$S(c)$ is the set of T-interfaces of $u$ that can operate on channel $c$
$S \leftarrow \{(c, x) \mid c \in C \setminus c(M_R(u)),\ x \in S(c)\}$
for all $(b, x) \in S$ do
    $w(b, x) \leftarrow$ time in queue spent by HOL packet of $Q_{ch}(u, b)$
    $s'(b, x) \leftarrow -1 \times$ (time to switch from $c(x)$ to $b$)
    if $(q_{if}(x) > 0)$ or ($\exists\, y \in M_T(u)$ such that $c(y) = b$ and $q_{if}(y) > 0$) then
        $S \leftarrow S \setminus \{(b, x)\}$
    end if
end for
while $S \neq \emptyset$ do
    $(d, y) \leftarrow \arg\max_{S} (w(b, x), s'(b, x))$
    if $q_{ch}(u, d) = 0$ then
        continue
    end if
    Transfer $\min\{q^{p}_{ch}(u, d),$ PKT_QUANTUM$\}$ packets from $Q_{ch}(u, d)$ to $Q_{if}(y)$
    for all $(b, x) \in S$ such that $x = y$ do
        $S \leftarrow S \setminus \{(b, x)\}$
    end for
    for all $(b, x) \in S$ such that $b = d$ do
        $S \leftarrow S \setminus \{(b, x)\}$
    end for
end while
Rate      RX Sensitivity    SINR Threshold
1 Mbps    -94 dBm           -2.92 dB
2 Mbps    -93 dBm           1.59 dB
6 Mbps    -87 dBm           6.02 dB

Table 6.2: Simulation Parameters


carrier-sense  function  deems  the  channel  idle  if  the  received  power  (not  considering  the 
thermal  noise  component)  is  less  than  the  carrier-sense  threshold;  thus,  the  stated  carrier- 
sense  threshold  should  be  interpreted  as  the  power  that  must  be  received  over  and  above 
the  thermal  noise  to  deem  the  channel  busy).  The  threshold  is  deliberately  chosen  to  be 
much  smaller  than  the  receiver  sensitivity  values,  as  the  resultant  carrier-sense  range  is  well 
beyond  2  hops  in  our  test  topologies,  and  this  allows  us  to  evaluate  the  effectiveness  of  the 
protocol  in  performing  channel  management  with  explicit  information  from  2  hops,  when 
channel  conflicts  extend  beyond  this  range.® 

For TCP simulations, the TCP Sack1 agent in ns-2 is used. The initial timeout value
has  been  changed  from  the  ns-2  default,  and  set  to  1.0s. 

The  protocol  has  been  evaluated  using  a  set  of  test  topologies,  which  involve  various 
different  kinds  of  interface  configurations  and  traffic  patterns,  and  facilitate  understanding 
of  the  strengths  and  weaknesses  of  the  protocol.  Each  plotted  point  on  the  graphs  is  an 
average  of  30  independent  runs,  and  the  95%  confidence  intervals  are  also  plotted. 
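For reference, one common way to obtain such an average and a 95% confidence interval from 30 runs is sketched below in Python. The text does not state how the intervals were computed, so this is an assumption: it uses the Student-t interval with 29 degrees of freedom (critical value approximately 2.045), and the function name `mean_ci95` and the sample values are hypothetical.

```python
import math
import statistics

T_CRIT_95_DF29 = 2.045  # two-sided 95% Student-t critical value for 29 degrees of freedom


def mean_ci95(samples, t_crit=T_CRIT_95_DF29):
    """Return (mean, half_width) of a t-based confidence interval."""
    mean = statistics.mean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


# Hypothetical throughput samples (Mbps) from 30 independent runs.
runs = [5.0] * 15 + [6.0] * 15
mean, half = mean_ci95(runs)
print(f"{mean:.2f} +/- {half:.2f} Mbps")
```

The default critical value is only appropriate for 30 samples; a different number of runs would need the matching t value.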

In  all  the  simulations,  we  have  an  initial  quiescent  period  of  40s  duration  to  allow  the 
pool-membership  to  stabilize,  before  any  data  transmissions  begin.  The  maximum  length 
of  any  data  session  in  the  simulations  is  10s.  We  have  intentionally  chosen  a  short  data 
session  length,  as  this  poses  a  more  difficult  case  for  the  protocol,  which  must  be  able  to 
adapt  to  the  traffic  at  a  sufficiently  fast  pace  to  provide  improved  performance  with  short 
session  lengths. 

6.6.1  Test  Topologies 

We  use  the  TwoRayGround  propagation  model  for  these  topologies,  as  the  primary  goal 
is  to  study  the  link  layer’s  ability  for  dynamic  adaptation  to  traffic  in  the  presence  of 

®Note  that  in  this  work  we  are  not  concerned  with  choosing  a  carrier-sense  threshold  value  that  is  optimal 
for  performance.  Our  goal  is  only  to  evaluate  the  performance  of  our  protocol  given  some  value  for  this 
parameter,  and  a  large  carrier-sense  threshold  poses  a  more  difficult  case  for  our  protocol. 
Parameter Name    Value
$T_{llinfo}$      0.5s
$J_{llinfo}$      1.0s
$T_{qinfo}$       0.75s
$J_{qinfo}$       0.25s
$T_{pool}$        4.0s
$T_{rassign}$     0.75s
$J_{rassign}$     1.0s
NBR_TTL           10.0s
IFR_TTL           10.0s
K                 1000 (bits)
H                 0.01s

^inertia 

0.1 

T 

rassign 

Smin 

crr^ 

T 

rassian 

Scomb 

0.01 

Table  6.3:  Protocol  Parameter  Values  Used  in  Simulations 


heterogeneous  radios/channels.  Results  using  the  probabilistic  Shadowing  model  over  some 
random  topologies  are  discussed  in  Section  6.6.2.  For  the  choice  of  simulation  parameters 
used, the TwoRayGround model yields an 802.11a transmission range of approximately
630-640m, and an 802.11g transmission range of approx. 900m. The carrier-sense range
is approximately 2130-2140m, which is greater than 3 hops for 802.11a transmissions, and
marginally greater than 2 hops for 802.11g transmissions. Note that the ranges obtained
with the TwoRayGround model for the chosen parameter settings are larger than what one
typically sees in practice; however the absolute value of the transmission range is not very
significant  for  our  evaluation.’^ 

While discussing the simulation results, we will sometimes refer to the number of
channels as c and the poolsize as /. Whenever we show per-flow throughput and the
session-durations of different flows are different, the throughput of each flow is computed as
total amount of useful data received at the flow destination divided by that flow's
session-duration. Whenever we show aggregate throughput, if the session-durations are different
for different flows, the aggregate throughput is computed as total amount of useful data

^However,  it  is  to  be  noted  that  the  larger  propagation  delays  do  have  a  small  effect  on  the  possibility  that 
two  nodes  within  carrier-sense  range  sense  the  channel  to  be  idle  at  around  the  same  time.  This  sometimes 
causes  a  few  packet  losses  due  to  collisions.  However,  given  the  low  data  rates  (and  hence  less  stringent 
SINR requirements), for certain relative locations of nodes, this can sometimes even improve throughput
marginally. 
received at all flow destinations divided by the maximum session-duration.
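The two throughput definitions above translate directly into code. In this Python sketch the function names and the example byte counts are hypothetical; each flow is represented as a (bits received, session duration in seconds) pair.

```python
def per_flow_throughput(bits_received, session_duration):
    """Useful bits received at the destination divided by that flow's session duration."""
    return bits_received / session_duration


def aggregate_throughput(flows):
    """Total useful bits over all flows divided by the maximum session duration."""
    total_bits = sum(bits for bits, _ in flows)
    longest = max(duration for _, duration in flows)
    return total_bits / longest


# Hypothetical example: two flows with different session durations.
flows = [(55.0e6, 10.0), (27.0e6, 9.4)]
print([per_flow_throughput(b, d) for b, d in flows])
print(aggregate_throughput(flows))
```

Dividing the aggregate by the longest session means that the sum of the per-flow throughputs can be slightly larger than the aggregate figure when flows start at different times.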

Multiple  independent  runs  for  each  data  point  were  obtained  by  seeding  the  defaultRNG 
object  in  ns2  with  a  single  selected  seed,  and  then  calling  the  next-substream  command  i 
times  for  the  i-th  run. 

Topology 1 The topology is depicted in Fig. 6.5. 9 nodes are arranged in a 3 by 3 grid
(the side of each grid square is 500m). Each node has one R-interface and one T-interface of
type 802.11a. There are 4 CBR flows: 0 $\rightarrow$ 1 at rate approx. 5.8 Mbps starts at t = 40.0s,
0 $\rightarrow$ 3 at rate approx. 5.8 Mbps starts at t = 40.5s, 2 $\rightarrow$ 5 at rate approx. 2.9 Mbps starts
at t = 40.6s, and 8 $\rightarrow$ 7 at rate approx. 2.9 Mbps starts at t = 40.9s. All flows run till the end
of simulation at t = 50.0s. The topology is of interest as it involves both interface and
interference conflicts. Note that an ideal scheduler can meet almost all the traffic demand
with just 3 channels, by assigning one channel to the R-interface of 0 and either of 1 or
3, assigning the second channel to the remaining node from amongst 1, 3, and assigning
the third channel to 5 and 7. We evaluate the following (number of channels, poolsize)
combinations: (1, 1), (3, 1), (12, 1), (3, 3), (12, 3), (12, 12).
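The three-channel assignment mentioned above can be checked with a few lines of arithmetic. The Python sketch below sums the offered load per channel; the channel labels "A", "B", "C" and the function name are hypothetical, and the comparison against a nominal 6 Mbps channel rate (the highest rate in Table 6.2) is an assumption made here for illustration.

```python
from collections import defaultdict

# CBR flows of Topology 1: (source, destination) -> offered rate in Mbps.
flows_mbps = {(0, 1): 5.8, (0, 3): 5.8, (2, 5): 2.9, (8, 7): 2.9}

# Ideal assignment from the text: channel of each receiver's R-interface.
receiver_channel = {1: "A", 3: "B", 5: "C", 7: "C"}


def offered_load_per_channel(flows, channel_of_receiver):
    """Sum the offered rate of all flows whose receiver listens on each channel."""
    load = defaultdict(float)
    for (_, dst), rate in flows.items():
        load[channel_of_receiver[dst]] += rate
    return dict(load)


print(offered_load_per_channel(flows_mbps, receiver_channel))
```

Every channel ends up with about 5.8 Mbps of offered load, just under the nominal 6 Mbps, which is consistent with the statement that almost all (but not all) of the demand can be met once protocol overheads are taken into account.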

The  throughput  results  are  depicted  in  Fig.  6.6.  Note  that  a  poolsize  of  3  typically 
yields  better  performance  than  a  poolsize  of  1  for  the  same  number  of  channels.  It  is  also 
interesting  to  note  that  with  12  channels  and  poolsize  3,  the  throughput  is  lower  than  the 
throughput  with  3  channels  and  poolsize  3.  The  reason  for  this  is  that  there  is  an  interface- 
conflict  that  arises  at  node  0,  as  it  has  only  one  T-interface  but  is  generating  data  for  both  1 
and  3  at  ~  5.8  Mbps  each.  Hence  it  is  desirable  to  have  the  R-interface  of  0  and  one  of  1  and  3 
on  the  same  channel  (so  that  0  can  use  its  R-interface  for  transmission),  while  the  T-interface 
is  used  to  transmit  packets  to  the  remaining  node  on  another  channel.  The  interface-conflict 
Topology 1: CBR Traffic (throughput plot; bars for dst 1, dst 3, dst 5 and dst 7)


component of the channel cost metric does try to capture this; however, sometimes the
receiver's R-interface cannot change its assignment to address interface conflicts as the
transmitter's R-channel may not be in the pool of the receiver's R-interface. This leads to
the observed inversion scenario. It can potentially be addressed by additional signaling
leading to pool-adjustment, but the extra complexity may not be justified if such scenarios
are not very common. Our explanation that the inversion phenomenon is caused by
channel-restriction is borne out by the fact that with (12, 12), the throughput is almost the
same as (actually slightly better than) that with (3, 3). The inability to address
interface-conflicts is also the cause of the inferior performance with (3, 1) and (12, 1).

The key observation is that (3, 3) and (12, 12) provide close-to-best-possible performance.


Topology 2 8 nodes, numbered 0, 1, ..., 7, are arranged in a linear chain. The separation
between adjacent nodes is 500m. Each node is equipped with an 802.11a R-interface and an
802.11a T-interface.

For K = 1, 2, ..., 7: We start a single K-hop flow from node 0 to node K at time
t = 40.0s, which is active till the end of simulation at t = 50.0s.

Figure 6.7: Topology 2 (Chain)

Fig. 6.8 shows the throughput when the flow comprises CBR (UDP) traffic (generated
at approx. 5.8 Mbps). For a given number of channels, setting poolsize (/) to 3 yields
better performance than / = 1. This is because when / = 1, the channel-assignment
criterion is solely the number of interfaces on that channel within 2 hops. With a
carrier-sense range larger than 2 hops, this may not always achieve good load-balance. Even
when the number of channels is large, e.g., c = 12, despite the high probability of interfering
interfaces having different channels due to sheer randomization, there tend to be a few
cases where the channel-assignment is bad, and this degrades the average throughput. This
also explains the greater variability (the confidence intervals are larger) with / = 1. When
/ = 3, the previously described channel cost metric is used, which includes a contention-cost
component that is able to capture high channel load. Thus, even if the channel-assignment
is sub-optimal at the time the flow starts, dynamic adaptation to the load occurs, and we
get better performance.

Fig.  6.9  shows  the  throughput  when  the  flow  comprises  FTP  (TCP)  traffic. 

As can be seen, the throughput with TCP shows a steady decrease as the number of
hops increases, even with multiple channels.

While it is true that the LL is better able to adapt to CBR traffic as compared to TCP
(since CBR traffic is inelastic, there is a steady queue build-up that eventually triggers
channel re-assignment), another major reason for the lower throughput with TCP in the chain
topology is the increased delay faced by TCP over multiple hops. As the number of hops
to traverse increases, the round-trip delay increases, which has a detrimental effect on TCP
throughput. Also note that the performance of almost all the multi-channel combinations is
very similar, although one can discern a semblance of relative trends similar to the CBR
case. The lack of differentiation can be attributed to the fact that the decline in throughput
as the number of hops increases tends to mask the differences due to channel-adaptation.
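The dependence of TCP throughput on round-trip delay can be illustrated with the standard window-limited bound, throughput ≤ W/RTT. The following is only a sketch of that bound and is not part of our simulation setup; the window size and the per-hop contribution to round-trip delay are hypothetical values chosen to show the trend.

```python
def window_limited_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most one window is in flight per RTT."""
    return window_bytes * 8 / rtt_seconds / 1e6


# Hypothetical values: a 32 KB window and 2 ms of round-trip delay added per hop.
WINDOW_BYTES = 32 * 1024
PER_HOP_RTT_SECONDS = 0.002

for hops in range(1, 8):
    bound = window_limited_throughput_mbps(WINDOW_BYTES, hops * PER_HOP_RTT_SECONDS)
    print(f"{hops} hops: at most {bound:.1f} Mbps")
```

Under these assumptions the bound falls in inverse proportion to the number of hops, irrespective of how many channels are available.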

We remark that the round-trip delay is substantially inflated by the fact that TCP
flows have bi-directional traffic (DATA and ACK), and thus at each hop the DATA and
ACK packets must share the same T-interface. As evidence of the dominant effect of delay
due to DATA/ACK contention, consider a variant scenario where we have the same chain
topology, but each node is equipped with an extra 802.11a T-interface. The throughput
results with TCP traffic are shown in Fig. 6.10. It is evident from the figure that the
decrease in throughput with increase in hops now occurs at a much slower rate. A similar
experimental observation about the improvement in TCP when using additional interfaces
for sending was made in the context of the Net-X testbed in [104].

Figure 6.8: Topology 2: CBR Traffic

Figure 6.9: Topology 2: TCP Traffic

Figure 6.10: Topology 2 (Extra T-Interface): TCP Traffic

Topology  3  25  nodes  are  arranged  in  a  5  by  5  grid  spatial  layout  (the  side  of  each  grid 
square  is  460m).  Thus,  the  logical  network  topology  is  also  a  5  by  5  grid.  Each  node  is 
equipped  with  one  pair  of  802.11a  interfaces  (one  R-interface  and  one  T-interface).  We 
pre-designate  12  (disjoint)  one-hop  SD  pairs,  as  depicted  in  Fig.  6.11.  We  vary  the  number 
of  channels  c.  If  c  channels  are  in  use,  the  first  c  sources  start  sending  data  at  t  =  40.0s  and 
continue  till  the  end  of  simulation  at  t  =  50.0s.  Thus,  the  number  of  flows  in  any  scenario 
is  the  same  as  the  number  of  channels.  Therefore,  an  ideal  omniscient  scheduler  can  assign 
each flow to a separate channel, and get the full benefit of each channel, providing maximum
throughput to each flow. However, we have a distributed protocol where each node only
has explicit information up to a two-hop neighborhood, and has reduced flexibility due
to channel-restriction. Thus, this topology provides a means of evaluating the efficacy of
the protocol in adapting the channel of an interface to traffic that may extend beyond its
two-hop neighborhood.

Figure 6.11: Topology 3

Figure 6.12: Topology 4

Figure 6.13: Topology 3: CBR Traffic

Figure 6.14: Topology 3: TCP Traffic

At  time  t  =  40.0s,  all  c  active  sources  start  sending  to  their  respective  destinations,  and 
continue  to  do  so  till  the  simulation  ends  at  t  =  50.0s. 

Fig. 6.13 depicts aggregate throughput for CBR traffic. Given c channels, a useful
benchmark is to compare the achieved throughput with c times the single-channel throughput.
While the difference between this and what the LL is able to achieve increases as c
increases, one can see that even with c = 12, the LL is able to get quite good performance.
Also, l = 3 shows a small but consistent performance gain over l = 1.
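The benchmark comparison above amounts to a simple ratio. The sketch below computes it; the function name and the numbers are hypothetical and are not read off Fig. 6.13.

```python
def benchmark_fraction(aggregate_mbps: float, single_channel_mbps: float, c: int) -> float:
    """Achieved aggregate throughput as a fraction of c times the single-channel throughput."""
    if c < 1 or single_channel_mbps <= 0:
        raise ValueError("need c >= 1 and a positive single-channel throughput")
    return aggregate_mbps / (c * single_channel_mbps)


# Hypothetical numbers for illustration only.
print(benchmark_fraction(aggregate_mbps=48.0, single_channel_mbps=5.0, c=12))
```

A value close to 1 would indicate that the protocol extracts nearly the full benefit of each additional channel.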

Fig. 6.14 depicts aggregate throughput for TCP traffic. The relative trends are similar,
although the throughput obtained is lower than in the case of CBR traffic.

Figure 6.15: Topology 4: CBR Traffic

Figure 6.16: Topology 4: TCP Traffic

Figure 6.17: Topology 4 with Extra T-interface: TCP Traffic


Topology 4  25 nodes are arranged in a 5 by 5 grid layout (the side of each grid square
is 600m). Thus, the logical network topology is also a 5 by 5 grid. Each node is equipped
with one pair of 802.11a interfaces (one R-interface and one T-interface). We pre-designate
8 (disjoint) one-hop SD pairs, as depicted in Fig. 6.12, in the two extreme columns of the
grid. All sources start transmitting at t = 40.0s and continue till the end of simulation at
t = 50.0s. Given the grid-size, it can be seen that all sources within the same grid column
are within each other's carrier-sense range, but the sources in different columns are not.
This yields a spatial reuse factor of 2 for up to 4 channels. Thus, an ideal scheduler needs
just 4 channels to be able to concurrently schedule the flows. We evaluate the efficacy of
the protocol in handling this situation.
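The count of 4 channels can be checked with a short calculation. The sketch below assumes, as stated above, that two flows conflict exactly when their sources lie in the same grid column, and that each flow needs a channel to itself to run at full rate; the function name and the column labels are illustrative.

```python
from collections import Counter


def channels_needed(flow_columns: list[int]) -> int:
    """Channels needed to run all flows concurrently when only same-column flows conflict."""
    return max(Counter(flow_columns).values())


# Topology 4: 8 one-hop flows, 4 in each of the two extreme columns (labelled 0 and 4).
flow_columns = [0, 0, 0, 0, 4, 4, 4, 4]
needed = channels_needed(flow_columns)
reuse_factor = len(flow_columns) / needed
print(needed, reuse_factor)
```

Each channel is shared by one flow from each extreme column, which is where the spatial reuse factor of 2 comes from.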

Fig. 6.15 depicts the aggregate throughput when all flows comprise CBR traffic at rate
approx. 5.8 Mbps each. As can be seen, even with just 4 channels, the performance with
poolsize 3 is very close to what we would expect from an ideal scheduler. A poolsize of 1
with 4 channels yields a performance that is moderately but not drastically inferior to using
a poolsize of 3. With 8 channels, the performance is almost the same for poolsize 3 (as
the performance of poolsize 3 with 4 channels is already close to the best possible, there is
little margin for improvement). However, with 8 channels, poolsize 1 also performs almost
as well, since the number of channels is sufficiently larger than the number of mutually
conflicting flows.

Figure 6.18: Topology 5

Fig.  6.16  depicts  the  aggregate  throughput  when  all  flows  comprise  FTP  traffic.  In  this 
case, we see that with 4 channels, the throughput is little better than twice the throughput
with 1 channel. Increasing the number of channels to 8 yields only marginal gain.
Once again, we remark that the LL is less effective in dynamically adapting the channel
assignment to TCP traffic, and this can explain the lower throughput with TCP to some
extent. However, the rather poor performance with TCP is also due to the fact that the
flow-endpoints are not disjoint. As can be seen, the destination of flow 1 is the source of
flow 2, and so on. As a result, these nodes have to share their T-interface between DATA
for  one  flow,  and  ACK  for  another.  Thus,  the  phenomenon  is  similar  to  what  we  discussed 
in  the  context  of  the  chain  topology.  To  verify  this,  we  equipped  each  node  with  an  extra 
T-interface,  and  performed  the  simulation  for  FTP  traffic.  The  aggregate  throughput  is 
depicted  in  Fig.  6.17,  and  shows  substantial  improvement  over  the  previous  case. 

Topology 5  This topology (Fig. 6.18) helps evaluate how the link layer schedules packets
over different channels and interfaces, given multi-hop flows with routes specified as
sequences of nodes. 9 nodes are arranged in a 3 by 3 grid layout (the side of each grid
square is 500m). Thus, the 802.11a induced topology is a 3 by 3 grid, but the 802.11g links
span diagonals. Each node has one R-interface and one T-interface of each type, 802.11a
and 802.11g.

Figure 6.20: Topology 5: CBR Traffic

There are 3 flows: 0 → 7 with manually specified route 0 → 3 → 6 → 7, 3 → 5 with
manually specified route 3 → 4 → 5, and 2 → 8 with manually specified route 2 → 5 → 8.

In the CBR traffic case, the traffic generation rates are: 0 → 7 at rate approx. 5.8
Mbps, 3 → 5 at rate approx. 2 Mbps, and 2 → 8 at rate approx. 5.8 Mbps. Note that an
ideal scheduler can meet almost all the traffic demand with just 5 802.11a channels, and 2
802.11g channels.

We evaluate performance with the following combinations of (number of 802.11a channels,
number of 802.11g channels, poolsize): (1,1,1), (6,3,1), (6,3,3), (12,3,1), (12,3,3).

Fig. 6.20 depicts the per-flow throughput with CBR flows. It can be seen that (6,3,3)
and (12,3,3) perform very well, and yield throughput fairly close to what we would expect in
the best case. This indicates that the LL is able to adjust the channel assignment as per the
traffic, and is also able to distribute packets across the different types of interfaces/channels
in a reasonable manner. For the same number of channels, the performance with l = 1 is
inferior to that with l = 3, due to lack of dynamic R-channel adaptation.

Fig. 6.21 depicts the throughput when all the 3 flows comprise FTP traffic. It can
be seen that the throughput is lower than the CBR case, which is to be expected as we
have multi-hop TCP flows. All multi-channel combinations have similar performance, as
any differences are likely masked by the degradation in TCP throughput due to traversing
multiple hops.

Figure 6.21: Topology 5: TCP Traffic

Topology  6  This  simple  topology  (Fig.  6.19)  illustrates  in  detail  how  the  link  layer 
handles  packet  scheduling,  when  there  are  neighbors  with  different  interface  types,  including 
multimode  T-interfaces.  The  topology  comprises  a  single-hop  network  of  3  nodes  0,  1,  2. 

We  consider  the  following  variant  scenarios: 

1. Topology 6.1: 0 and 1 have one R-interface and one T-interface each of type 802.11a
and 802.11g. 2 has one 802.11g R-interface and one 802.11g T-interface. One flow:
0 → 1. Two traffic scenarios are considered: (i) approx. 7.73 Mbps CBR (ii) FTP

2. Topology 6.2: 0 and 1 have one R-interface and one T-interface each of type 802.11a
and 802.11g. 2 has one 802.11g R-interface and one 802.11g T-interface. Two flows: (i)
0 → 1 at approx. 5.8 Mbps CBR, 0 → 2 at approx. 1.93 Mbps CBR (ii) 0 → 1 FTP
and 0 → 2 FTP

3. Topology 6.3: 0 and 1 have one 802.11a R-interface, one 802.11g R-interface, and one
802.11ag T-interface. 2 has one 802.11g R-interface and one 802.11g T-interface. One
flow: 0 → 1. Two traffic scenarios are considered: (i) approx. 7.73 Mbps CBR (ii)
FTP

4. Topology 6.4: 0 and 1 have one 802.11a R-interface, one 802.11g R-interface, and one
802.11ag T-interface. 2 has one 802.11g R-interface and one 802.11g T-interface. Two
flows: (i) 0 → 1 at approx. 5.8 Mbps CBR, 0 → 2 at approx. 1.93 Mbps CBR (ii)
0 → 1 FTP and 0 → 2 FTP

Figure 6.22: Topology 6.1: CBR Traffic

Figure 6.23: Topology 6.1: TCP Traffic

In all the above scenarios, the 0 → 1 flow starts at t = 40.0s, the 0 → 2 flow starts at
t = 42.0s (whenever applicable), and continue(s) till the end of simulation at t = 50.0s.
We evaluate performance with the following combinations of (number of 802.11a channels,
number of 802.11g channels, poolsize): (1,1,1), (3,3,1), (3,3,3).

In all the above topologies, the performance with (1,1,1) is very good. When there is
only one channel of each type, the LL of each node deactivates the T-interfaces. As a result,
nodes 0 and 1 effectively have one 802.11a interface on the single 802.11a channel and one
802.11g interface on the single 802.11g channel, while node 2 has one 802.11g interface on
the one 802.11g channel. Thus, node 0 can simultaneously transmit on both channels to its
destination(s).

In  Topologies  6.1  and  6.2,  node  0  has  one  T-interface  of  each  type,  and  thus  both  the 
multi-channel  combinations  also  have  performance  similar  to  (1, 1, 1). 

When node 0 has a single multi-mode T-interface, and there is a single flow (Topology
6.3), (3,3,1) exhibits lower performance than the other two combinations. The difference
is more marked with CBR traffic (Fig. 6.26), as compared to TCP traffic (Fig. 6.27).
This is because with (3,3,1), the R-interfaces are more likely to be on different channels,
and thus node 0 can only use its multi-mode T-interface to send data. Note that the local
interface conflict score helps ensure that the data is primarily sent using the 802.11a channel
on the multi-mode interface. In case of (1,1,1), by default the R-interfaces get used, as
explained earlier, and we get the benefit of data-striping across 2 channels. With (3,3,3),
the network is likely to initially have the R-interfaces on different channels, but is able to
quickly adapt based on the interface conflict cost, and get the benefit of data-striping across
two interfaces/channels. TCP throughput is typically moderately lower than CBR throughput,
and the difference between (3,3,1) and (3,3,3) is not very marked. This can be explained
by the fact that TCP probably gets less benefit from data-striping due to out-of-order
delivery issues.

Figure 6.24: Topology 6.2: CBR Traffic

Figure 6.25: Topology 6.2: TCP Traffic

Figure 6.26: Topology 6.3: CBR Traffic

Figure 6.27: Topology 6.3: TCP Traffic

When node 0 has a single multi-mode T-interface, and there are two flows (Topology
6.4), (3,3,1) again exhibits much lower performance (this time for both CBR and TCP),
as there is a smaller chance of R-channel overlap, and thus, node 0 must typically time-
share its T-interface to send to node 1 and node 2. The other multi-channel combinations
benefit from the R-interfaces, as already explained above. Also note that despite having to
contend for the same interface, the two flows each get reasonable throughput. Of course, the
throughput for destination 2 is lower, since the packet-scheduler tries to achieve a balance
between providing some fairness and getting the best rate (though the two flows do get
a reasonably fair share of interface time). If greater throughput fairness is needed, the
scheduling rules can be suitably modified to achieve that.

Figure 6.28: Topology 6.4: CBR Traffic

Figure 6.29: Topology 6.4: TCP Traffic

Topology  7  9  nodes  are  arranged  in  a  3  by  3  grid  (the  side  of  each  grid  square  is  500m). 
Node  4  has  4  R-interfaces  of  type  802.11a,  and  one  T-interface  of  type  802.11a.  All  other 
nodes  have  one  R-interface  and  one  T-interface  of  type  802.11a.  There  are  4  one-hop  flows 
1 → 4, 3 → 4, 5 → 4, 7 → 4, which start at times t = 40.0s, 40.5s, 41.0s, 41.5s respectively,
and  continue  till  end  of  simulation  at  t  =  50.0s.  In  the  CBR  traffic  case,  each  flow  has 
traffic  rate  approx.  5.8  Mbps. 

The  following  combinations  of  (number  of  channels,  poolsize)  were  simulated:  (1,  1), 
(4,  1),  (4,  3),  (12,  1),  (12,  3),  (12,  12). 

This topology is of interest as it involves nodes with a different number of radio
interfaces. Moreover, it is representative of scenarios where node 4 may be a gateway or
server node which is likely to be more capable than others, and to which much of the traffic
might be directed. It is also of interest as an illustration of how various LL mechanisms
complement and supplement each other.

Figure 6.30: Topology 7

The first observation we make is that an ideal scheduler needs just 4 channels to get
best-possible performance (as the receiver has 4 R-interfaces), and it can do so by simply
partitioning the 4 senders across channels. However, when each sender independently
decides which channel(s) to use, there is the possibility that two or more senders may try to
access the same channel at the same time. This would create contention on this channel,
while some other channel might be unutilized.
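To give a rough sense of how likely such contention is, consider a deliberately simplified model in which each sender picks one channel independently and uniformly at random. This is an assumption made purely for illustration and is not how the LL actually selects channels. Under it, the probability that n senders all land on distinct channels out of c is c!/((c-n)! c^n).

```python
from math import perm


def prob_all_distinct(senders: int, channels: int) -> float:
    """Probability that independent uniform choices put every sender on its own channel."""
    if senders > channels:
        return 0.0
    return perm(channels, senders) / channels ** senders


# Topology 7: 4 senders choosing among the 4 R-channels of node 4.
p_distinct = prob_all_distinct(4, 4)
print(f"all distinct: {p_distinct:.4f}, some contention: {1 - p_distinct:.4f}")
```

With 4 senders and 4 channels this gives 24/256, so uncoordinated choices would almost always leave at least one channel shared and another idle. This is why some coupling between the senders' decisions is needed.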

Fig.  6.31  shows  the  aggregate  throughput  when  all  the  flows  are  CBR.  Note  that 
all  multi-channel  combinations  give  throughput  that  is  fairly  close  to  the  best-possible 
throughput;  (4, 1),  (4,3)  and  (12, 12)  provide  best  performance,  but  even  (12,3)  and  (12, 1) 
are  only  marginally  inferior.  However,  each  combination  has  some  distinct  characteristics, 
as  we  now  explain. 

Suppose we did not have an interface conflict cost or a local interface conflict score.

Note  that  when  we  have  just  4  channels,  we  would  always  get  very  good  throughput  (for 
both  (4, 1)  and  (4,3)).  The  reason  is  as  follows:  each  of  the  4  R-interfaces  at  node  4  will 
be  on  one  each  of  these  channels.  The  R-interfaces  at  the  senders  must  also  be  on  some  of 
these  channels  (as  there  are  no  other  channels).  Thus,  there  is  bound  to  be  overlap  in  the 
R-channels  of  senders  and  receivers.  This  would  allow  some/all  of  the  senders  to  use  both 
their  interfaces  for  sending  (since  the  LL  performs  data-striping).  Moreover,  there  is  likely 
to  be  at  least  one  active  sending  interface  on  each  channel,  and  we  can  get  good  channel 
utilization  and  throughput. 

Suppose  we  have  an  interface  conflict  cost,  but  do  not  have  a  local  interface  conflict 
score. 

With (12,12), all the R-interfaces are initially likely to be on different channels, but after
the data sessions start, if the senders are not able to send data fast enough, the queues will
build up, and the interface-conflict cost will tend to lead the R-interfaces of node 4 to switch
to the R-channels of each of the senders. With (12,3) such an adaptation is less likely to
happen (due to the channel restriction), and throughput would be lower.

Figure 6.31: Topology 7: CBR Traffic

With  (12, 1),  there  is  a  very  small  chance  of  substantial  R-channel  overlap  a  priori,  and 
there  is  no  traffic-dependent  R-channel  re-assignment  to  aid  this.  Moreover,  in  the  absence 
of  a  local  interface  conflict  score,  the  channel-binding  algorithm  at  each  sender  will  tend 
to  bind  packets  to  many  different  channels  (from  amongst  the  4  choices),  if  the  neighbor- 
queue  is  sufficiently  large  when  the  CH-scheduler  is  invoked.  This  would  lead  to  reduced 
throughput. 

The  use  of  the  local  interface  conflict  score  helps  address  this. 

Note  that  each  sender  has  4  channel  choices  for  sending  data,  but  only  one  T-interface 
(assuming  there  is  no  overlap  in  R-channels).  Thus,  all  these  4  channels  have  a  local  interface 
conflict  with  each  other.  When  the  channel  binding  procedure  is  executed,  packets  will  be 
preferentially  bound  to  the  channel  with  highest  net  datarate,  and  hence  lowest  recent 
contention.  Once  some  packets  have  been  bound  to  this  channel,  the  local  interface  conflict 
would  make  the  other  channels  ineligible.  Therefore,  if  each  sender  were  to  have  a  different 
best  channel,  this  would  lead  to  a  near-partition  of  senders  across  channels.  Note  that  by  the 
very  nature  of  the  net  datarate  statistic,  a  node  that  recently  won  quick  access  to  a  channel 
will  consider  it  a  good  channel,  while  other  senders  are  likely  to  find  it  less  attractive.  This 
is  likely  to  lead  to  the  desired  scenario.  This  explains  the  good  performance  even  with 
(12,1)  and  (12,3). 
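The effect of the local interface conflict on channel binding can be sketched as follows. This is a simplified illustration and not the actual channel-binding procedure: the function name, the data layout, and the rule that the first channel to claim an interface excludes all others on that interface are assumptions made for the example.

```python
def eligible_channels(net_datarate: dict[int, float], interface_of: dict[int, str]) -> list[int]:
    """Channels a sender would bind packets to under a simplified local-conflict rule.

    net_datarate: channel -> recent net datarate estimate (higher is better).
    interface_of: channel -> local interface that would have to serve that channel.
    Channels are considered in decreasing order of net datarate; once a channel
    claims an interface, other channels needing the same interface are ineligible.
    """
    claimed: dict[str, int] = {}
    for channel in sorted(net_datarate, key=net_datarate.get, reverse=True):
        claimed.setdefault(interface_of[channel], channel)
    return sorted(claimed.values())


# Hypothetical sender with a single T-interface and 4 candidate channels.
rates = {1: 5.0, 2: 3.0, 3: 4.0, 4: 1.0}
print(eligible_channels(rates, {1: "T", 2: "T", 3: "T", 4: "T"}))

# Same sender, but channel 2 is also its own R-channel, so its R-interface can serve it.
print(eligible_channels(rates, {1: "T", 2: "R", 3: "T", 4: "T"}))
```

In the first case only the best channel remains eligible, which is the behavior that tends to partition senders across channels when their best channels differ.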

Fig. 6.32 shows the aggregate throughput with TCP flows. The trends are similar to
the CBR case, although the achieved throughput is generally lower.

Figure 6.32: Topology 7: TCP Traffic

6.6.2  Random  Topologies 

To  get  some  insight  into  performance  in  a  lossy  environment,  we  have  performed  some 
simulations  on  random  topologies  with  the  shadowing  model. 

We considered 10 static random topologies of 30 nodes over a 600m x 600m area. The
shadowing model in the ns-2 simulator was used with a path-loss exponent of 2.5 and a
shadowing deviation of 2dB.

Each node is equipped with an 802.11a R-interface, an 802.11a T-interface, an 802.11g
R-interface and an 802.11g T-interface. We pre-designate 12 nodes as potential sources:
s1 = 0, s2 = 2, ..., s12 = 22. We consider 2 channel/traffic configurations:

•  1 802.11a channel, 1 802.11g channel, poolsize 1, referred to as (1,1,1). At t =
40.0s, s1 chooses a random next-hop node as destination and starts transmitting. It
continues to do so till the simulation ends at t = 50.0s.

•  12 802.11a channels, 3 802.11g channels, poolsizes 1 and 3, referred to as (12,3,1)
and (12,3,3) respectively. At t = 40.0s, all 12 sources s1, ..., s12 choose a random
next-hop node as destination and start transmitting. They continue to do so till the
simulation ends at t = 50.0s.

Thus, in each configuration, the number of flows is the same as the number of 802.11a
channels in use.

Figure 6.33: Random Topologies: CBR Traffic

Figure 6.34: Random Topologies: TCP Traffic


To  get  a  random  sampling  of  links  from  the  designated  source(s),  the  random  choice 
of  neighboring  destination  is  made  at  runtime  for  each  run,  by  inspecting  the  link  layer 
neighbor-list  of  the  source  node,  and  making  a  random  selection  from  amongst  all  symmetric 
neighbors  (without  regard  to  802.11a  reachability).  Thus,  the  link  may  only  be  operational 
on 802.11g. Also, the destinations are likely to be different for each of the 30 runs for
each  plotted  point.  Furthermore,  as  the  choice  is  made  dynamically  at  runtime,  there 
is  a  small  possibility  that  it  may  not  always  be  the  same  for  the  same  run  number  of 
different  configurations,  even  though  the  seed  is  the  same  (this  can  happen  if  the  neighbor- 
list  membership  is  different  at  the  time  of  selection,  which  is  not  very  likely  except  in  very 
lossy  scenarios,  i.e.,  very  large  shadowing  deviation  values). 

Multiple  independent  runs  for  each  data  point  were  obtained  as  follows:  for  independent 
run  i,  the  defaultRNG  object  in  ns2  was  seeded  with  a  single  selected  seed  (the  same  for 
all  runs),  and  then  the  next-substream  command  was  invoked  i  times.  Each  channel  has 
a  separate  associated  Shadowing  propagation  object  in  our  simulation  code;  each  of  these 
objects  also  has  an  associated  RNG.  These  are  not  explicitly  seeded  (we  have  changed  the 
default  ns2  behavior),  as  the  ns2  random  number  generator  automatically  assigns  a  seed 
to each new RNG corresponding to an independent stream, once the defaultRNG has been
seeded.  However,  the  next-substream  command  is  invoked  i  times  on  each  Shadowing  RNG 
for  run  i.  In  addition,  each  node’s  LL  has  an  RNG  which  is  used  for  the  random  destination 
choice.  These  are  also  assigned  automatic  independent  seeds  by  ns2.  The  next-substream 
command  is  invoked  i  times  on  each  of  these  RNGs  for  run  i. 

Fig. 6.33 shows the aggregate throughput when all flows are CBR with rate approx. 5.8
Mbps. The throughput for (1,1,1) is much lower than what we would ideally expect, due
to the losses induced by the shadowing model. Similarly, the throughput for (12,3,1) and
(12,3,3) is also much lower than the ideal. However, we do consistently get approximately a
6-7 times improvement over the single-channel case by using multiple channels. While this is
certainly far from ideal, given that there are 12 802.11a and 3 802.11g channels, it is actually
quite satisfactory, given the nature of the topology and the traffic pattern. Recall that we
choose a random neighbor from the neighbor-list of the designated source(s). Thus, often
the neighbor may be reachable only using 802.11g (which has higher range). Since there
are only 3 802.11g channels, this can limit the possible improvement. Another observation
is that the average aggregate throughput with (12,3,1) is marginally but fairly consistently
higher than (12,3,3), but in most cases, the confidence-intervals overlap substantially, and
so the difference is not very relevant statistically. The slightly better performance of (12,3,1)
can be explained by the fact that (12,3,3) does not get much opportunity to gain from
dynamic adaptation (if many active links are 802.11g only, then there is not much scope
for adaptation; moreover the channel estimation procedure is not very sophisticated, and
may thus occasionally initiate unwarranted R-channel changes on perceiving a low effective
rate on the current channel), but it does incur some additional overhead since more control
data is sent when the poolsize is greater than 1.

Fig.  6.34  shows  the  aggregate  throughput  when  all  flows  are  FTP,  i.e.,  TCP  traffic.  The 
relative trends are similar, though the throughput is lower, and the comparative improvement
on using multiple channels is also smaller. This is due to the greater impact of losses
on  TCP,  even  leading  to  flow-starvation  sometimes.  The  difference  between  (12,3, 1)  and 
(12,3,3)  is  much  more  marked,  though  the  confidence  intervals  still  exhibit  overlap. 

6.7  Discussion 

The proposed HMCLL protocol is able to address a wide range of scenarios in a satisfactory
manner. It is to be noted that much of the benefit of using dynamic channel adaptation
seems to arise in scenarios where there are interface conflicts, or in scenarios with
interference conflicts with the number of active links comparable to the number of channels.
When there are only interference conflicts and the number of flows is much smaller than
the number of channels, even having poolsize 1 (which corresponds to a quasi-static
combinatorially load-balanced assignment) usually works fairly well. However, having a poolsize
greater than 1 does generally help improve consistency even in such situations, and helps
avoid the occasional worst-case scenarios that can arise with poolsize 1. The two-level
scheduling component provides fairly satisfactory performance. In particular, the coupling
introduced by the local interface conflict score helps the LL effectively address scenarios
with multi-mode T-interfaces and scenarios where the receiver has many R-interfaces, but
the sender may have only one or few T-interfaces.

The  LL  is  able  to  adapt  the  channel  assignment  to  CBR  traffic  much  more  easily  than  to 
TCP  traffic.  The  primary  reason  for  this  is  that  if  the  network  is  crrently  in  a  sub-optimal 
channel  assignment  configuration,  the  queues  will  build  up  in  the  CBR  case,  and  when 
the information propagates within 2 hops, it will likely trigger an R-channel switch at some
interface(s) to a less loaded channel. However, with TCP traffic (especially flows that
traverse just one hop), the queues may never become very large, as the source may adjust its
rate quickly to the available bandwidth. Thus, the queue may not always build up to the
extent needed to trigger a switch (recall that we have an element of hysteresis). To alleviate
this,  we  have  included  an  excess  utilisation  component  in  the  interface  conflict  cost.  We  also 
have  an  implicit  interference-cost  element  which  is  based  on  experienced  contention-time, 
which  helps  address  both  the  issue  of  load  due  to  TCP  flows,  and  also  load  due  to  any  type 
of traffic which lies beyond two hops (as the interface will not have explicit information of
this).  However,  there  is  still  potential  for  further  improvement. 

The  results  for  the  random  topologies  with  the  shadowing  model  indicate  that  poolsize 
1  is  actually  marginally  better.  As  mentioned  earlier,  this  can  be  explained  by  the  fact 
that  there  is  limited  potential  for  improvement  through  dynamic  adaptation,  and  having  a 
poolsize  greater  than  1  implies  slightly  more  overhead,  and  can  also  cause  some  unwarranted 
channel  switches  (since  the  channel  estimation  procedure  is  quite  rudimentary,  and  involves 
little  active  probing).  Thus,  there  is  much  potential  for  improvement  along  these  lines. 

6.8  Future  Directions 

In  the  course  of  our  work  on  the  described  protocol,  we  have  identified  certain  interesting 
directions for future work, involving both theoretical and protocol design aspects.

Neighborhood Management Currently, our protocol makes the implicit assumption
that  reachability  characteristics  are  the  same  for  all  channels  in  the  same  band.  Thus,  if 
a  neighbor  is  deemed  reachable  using  an  802.11a  channel,  then  the  effective-rate  for  all 
802.11a  channels  on  that  link  is  set  to  be  the  raw  datarate,  till  some  rate-history  has  been 
accumulated  (as  a  result  of  packet  transmissions).  However,  reachability  characteristics  can 
be  different  even  for  channels  in  the  same  band.  One  reason  for  this  is  the  possibility  of 
varying  levels  of  external  noise.  Another  reason  is  that  the  difference  in  frequency  can  lead 
to different propagation characteristics. One would expect this difference not to have a
significant effect within a single band. However, if two nodes are at the fringes of
each other's transmission range, a change of R-channel by one neighbor can potentially even
make them unreachable. Thus, more sophisticated neighborhood management is desirable,
especially  since  this  can  have  important  implications  for  the  topology  visible  to  a  routing 
protocol,  and  can  substantially  affect  performance. 

Channel Quality Estimation Design of efficient probing strategies for channel estimation
is an important direction for future work, with need for theoretical solutions, as well as
practical strategies based on theoretical insight. Some results on optimal probing strategies
for the single user/link case are available in the literature, e.g., [17]. However, there is a dearth of
approaches that take the multi-link, multi-hop setting into account. A related issue is that
of  reacting  to  a  jammed  or  highly  noisy  channel. 

Suitable  Decision  Policies  The  LL  maintains  a  wide  range  of  statistics  pertaining  to 
traffic  and  channels.  There  is  typically  a  different  degree  of  confidence  for  different  statistics 
(depending  on  the  frequency  of  observation  or  reports).  Thus,  it  would  be  desirable  to 
adopt  an  approach  in  which  the  response  to  an  observation  is  dependent  on  the  degree  of 
confidence, i.e., one could vary the degree of hysteresis based on the degree of confidence (if
more confident, the protocol can react more promptly; if less confident, the response can
have  more  damping).  Formulating  such  policies  is  an  interesting  direction  for  future  work. 
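
One way such a confidence-dependent policy could look is sketched below. This is only an illustration under stated assumptions, not part of the described protocol: the names `Estimate`, `switch_margin` and `should_switch`, the confidence scale of [0, 1], and the margin values 0.1 and 0.5 are all hypothetical. The idea shown is that a reported cost must undercut the current cost by a relative margin, and the margin shrinks as confidence in the report grows.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A reported cost together with a confidence value in [0, 1] (hypothetical scale)."""
    value: float
    confidence: float


def switch_margin(confidence: float, base_margin: float = 0.1,
                  max_margin: float = 0.5) -> float:
    """Relative hysteresis margin: small when confident, large when unsure."""
    c = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return max_margin - c * (max_margin - base_margin)


def should_switch(current_cost: float, candidate: Estimate) -> bool:
    """Switch only if the candidate cost beats the current cost by the margin."""
    margin = switch_margin(candidate.confidence)
    return candidate.value < current_cost * (1.0 - margin)
```

With these illustrative numbers, a candidate cost of 8.5 against a current cost of 10 triggers a switch when the report is fully trusted, but not when the confidence is zero, since the required improvement then rises from 10% to 50%.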

Implementing  a  Link  Layer  Reordering  Buffer  Since  the  LL  performs  data-striping, 
there  is  a  likelihood  of  out-of-order  packet  delivery,  when  nodes  have  multiple  R-interfaces. 
This could be rectified by having a reordering buffer at the receiving transport endpoint.
However, it may be desirable to keep the LL's functions completely transparent to the higher
layers, so that no changes to the higher layers are required for using the link layer protocol.
Thus, it may be useful to implement a reordering buffer at the link-layer. Since the
data-striping would be performed by each local link-layer over each link, this can be done by using
link  layer  sequence  numbers  for  all  transmitted  packets  over  a  link,  and  holding  received 
out-of-order  packets  in  a  buffer  till  prior  sequence  numbers  have  been  received.  This  can 
also  enable  TCP  traffic  to  derive  benefit  from  LL  data-striping  (our  current  simulations 
indicate  that  TCP  does  not  benefit  much). 
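
The per-link reordering idea can be sketched as follows. This is a minimal illustration, assuming one buffer instance per incoming link and unbounded integer sequence numbers; the class name `ReorderBuffer` and its methods are hypothetical. A practical version would also need a timeout or gap-skipping rule so that a lost packet does not stall delivery indefinitely, which is omitted here.

```python
class ReorderBuffer:
    """Holds out-of-order packets for one link until earlier ones arrive."""

    def __init__(self, first_seq: int = 0):
        self.next_seq = first_seq  # next sequence number to deliver upward
        self.pending = {}          # seq -> packet, received but not yet deliverable

    def receive(self, seq: int, packet):
        """Accept a packet; return the packets that can now be delivered in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already delivered or buffered
        self.pending[seq] = packet
        released = []
        # Release the longest run of consecutive sequence numbers available.
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

For example, if packet 1 arrives before packet 0, the first call returns nothing and the second call returns both packets in the original order.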

Routing In this chapter, we described a link layer protocol that performs dynamic adaptation.
Given that this protocol addresses issues arising from heterogeneity of interfaces and
channels  at  the  link  layer,  it  is  of  interest  to  devise  a  routing  protocol  that  does  not  have 
any  knowledge  of  specific  low-level  details  of  channels/radios,  etc.  This  protocol  would  take 
an  abstracted  link/node  cost  metric  from  the  LL,  and  use  it  for  route-selection  (with  the 
route  being  a  sequence  of  nodes).  With  such  an  approach,  the  same  routing  protocol  can 
work  in  a  diverse  set  of  scenarios  with  different  hardware  specifications,  since  the  knowledge 
of low-level details is encapsulated by the LL.

Distance-vector  routing  is  typically  not  very  suitable  for  the  envisioned  scenarios,  as  it 
does not provide enough flexibility in quantifying the cost of a route. If proactive routing
is desired, link-state routing appears to be the best fit. If reactive routing is desired,
source-routing seems to be most appropriate. The key challenge lies in designing a suitable
metric that is capable of capturing traffic levels (which lead to interface bottlenecks), available
channel/interface diversity along the path, and long-term link conditions along the path.
However,  any  traffic-based  cost  exposed  to  the  routing  layer  should  typically  be  computed 
over  a  longer  timescale  than  costs  used  by  the  LL  decisions,  else  instability  may  result  [54]. 
Moreover,  since  the  LL  may  locally  adapt  and  cause  channel-switching  anytime  during  the 
lifetime  of  a  route,  the  metric  should  typically  not  be  based  on  current  channel  of  operation 
of  interfaces;  rather  it  should  take  into  account  the  channel-diversity  available  in  the  form 
of  the  channel-pool. 

Extension to wider range of heterogeneous hardware capabilities Another direction
involves extending the envisioned stack architecture to address a wider range of
heterogeneous hardware capabilities, e.g., consider a scenario involving multiple heterogeneous
radios/channels, as well as heterogeneous antennas. Such an effort can be quite
useful,  and  can  provide  a  generic  design  template  for  a  wide  range  of  scenarios. 

Similarly,  one  could  try  to  extend  the  scope  to  include  making  decisions  about  rate/power 
at  the  link  layer,  as  well  as  address  scenarios  where  two  interfaces  of  the  same  type  may  have 
different number/type of antennas, yielding different reachability characteristics (whether
and at what rate one can directly communicate with a nearby node). To an extent, the current
design is capable of serving as a template for this wider range of scenarios. The current
design  assumes  that  reachability  characteristics  are  solely  a  function  of  the  channels  that  a 
node  can  be  reached  on;  thus  we  have  a  set  of  channel  queues.  One  could  extend  this  to  a 
set of queues for various combinations of choices (instead of a separate level of interface
queues with a separate IF-scheduling, it would be reasonable to include the interface-choice
as  part  of  the  combination);  the  CH-scheduler  can  still  be  used  by  defining  appropriate 
conflict  relations  between  these  queues. 

However,  a  major  issue  in  addressing  such  multi-parameter  adaptation  is  the  resultant 
increase in unpredictability. In the currently addressed scenario, the reachability characteristics
are a function of the channel, and of the availability of interfaces capable of switching
to the particular channel at each node under consideration. They are largely, though not
exclusively, determined by the R-channel selection, which operates over much longer timescales
than packet scheduling. Furthermore, packet scheduling decisions are made over a quantum
of packets, while the net datarate estimate (used in the channel-selection decision) is
updated after every packet. The greater the number of adaptable parameters, the greater is
the dependence of the achievable rates on the decisions being made by other nearby nodes
per packet, which increases complexity. To handle this, it may be beneficial to incorporate
more  structure  in  terms  of  potential  multi-timescale  parameter  tuning  (akin  to  the  current 
channel  restriction),  as  well  as  possibly  increasing  the  scheduling  quantum  size  (the  goal 
being  not  amortization  of  overhead,  but  achieving  predictability  in  what  will  happen  over 
the  timescale  of  next  few  packets). 

Thus, there are many interesting directions worthy of exploration.

Chapter 7

Reliable  Broadcast  in 
Failure-prone  Wireless  Networks 


The  increasing  use  of  wireless  networks  in  critical  application  scenarios  provides  motivation 
for designing reliable communication algorithms that can leverage the distinct characteristics
of the wireless channel. In this chapter, we introduce the reliable broadcast problem in
the  wireless  context,  and  describe  the  underlying  model  and  assumptions  for  the  results  in 
subsequent  chapters.  We  also  discuss  related  work. 

7.1  Assumptions 

We  consider  an  idealized  wireless  network.  There  is  a  single  common  channel  of  operation, 
and  all  nodes  are  equipped  with  a  single  half-duplex  transceiver.  The  wireless  channel  is 
assumed  to  be  perfectly  reliable,  i.e.,  if  a  node  transmits  a  message,  and  no  other  node  in 
the  vicinity  is  transmitting  simultaneously  (i.e.,  if  no  collisions  occur),  then  the  message  is 
guaranteed  to  be  received  by  all  nodes  within  its  range  (termed  its  neighbors).  Note  that 
this  idealized  shared  wireless  channel  intrinsically  preserves  ordering  of  messages  sent  by 
a node, i.e., if a node transmits messages $m_1$ and $m_2$ in that order, they will be
received  in  that  same  order  by  all  neighbors.  We  call  this  idealized  behavior  the  reliable 
local  broadcast  assumption.  While  this  assumption  does  not  hold  per  se  in  real  wireless 
networks,  it  may  be  possible  to  implement  a  local  broadcast  primitive  that  can  provide 
probabilistic  guarantees  (given  the  probabilistic  nature  of  wireless  channel  losses,  a  fully 
deterministic  approach  is  not  feasible  in  reality).  Such  a  primitive  could  then  be  used  as  a 
subroutine  by  a  global  broadcast  algorithm. 

We  assume  synchronous  communication.  More  specifically,  for  the  results  in  Chapter  8 
and  Chapter  9,  we  assume  that  there  is  an  underlying  collision-free  TDMA  schedule,  where 
time  is  divided  into  rounds,  and  each  node  has  a  designated  transmission  slot,  which  it  can 
use to transmit without interfering or being interfered with, if it needs to. If a message is
transmitted by a node, then it is received by all its neighbors within a bounded amount of
time  (i.e.,  by  the  end  of  the  slot). 

Another  assumption  is  that  all  nodes  adhere  to  the  collision-free  schedule;  even  the 
faulty  nodes  do  not  deliberately  cause  collisions  by  transmitting  out-of-turn.  Similarly, 
they  do  not  spoof  the  MAC  addresses  of  other  nodes.  One  way  to  view  this  situation  is 
that  the  physical  (PHY)  and  medium  access  control  (MAC)  layers  of  all  nodes  are  fault-free, 
and  the  MAC  layer  does  not  allow  higher  layers  to  cause  a  change  of  MAC  addresses.  Thus, 
if  all  nodes  have  a  priori  unique  MAC  addresses,  then  each  transmitted  message  (packet) 
will  carry  the  true  and  unique  identity  of  the  node  that  transmitted  the  packet  in  its  MAC 
header.  Note  that  this  means  that  each  node  knows  the  correct  identity  of  the  previous  hop 
node  from  which  it  received  the  packet.^  However,  if  the  packet  traversed  multiple  hops, 
the  identity  of  the  original  sender  or  the  previous  hop  relays  (if  included  in  the  message 
contents),  may  be  subject  to  tampering  by  a  faulty  relay. 

For our results in subsequent chapters, we consider two distance metrics: $L_\infty$ and $L_2$.
The $L_\infty$ metric is the metric induced by the $L_\infty$ norm, such that the distance between
points $(x_1, y_1)$ and $(x_2, y_2)$ is given by $\max\{|x_1 - x_2|, |y_1 - y_2|\}$ in this metric.

The $L_2$ metric is induced by the $L_2$ norm, and is the Euclidean distance metric. The
$L_2$ distance between points $(x_1, y_1)$ and $(x_2, y_2)$ is given by $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
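
As a concrete illustration of the two metrics, the following short sketch computes both distances for a pair of points, using the maximum coordinate difference for $L_\infty$ and the Euclidean distance for $L_2$. The function names `linf_distance` and `l2_distance` are chosen here for illustration only.

```python
import math


def linf_distance(p, q):
    """L-infinity distance: the larger of the two coordinate differences."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def l2_distance(p, q):
    """L2 (Euclidean) distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])
```

For the points (0, 0) and (3, 4), the $L_\infty$ distance is 4 while the $L_2$ distance is 5, so for a given range $r$ the $L_\infty$ neighborhood (a square) contains the $L_2$ neighborhood (a disc).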

7.2  Problem  Definition 

The reliable broadcast problem for a designated source is defined as follows:

There  is  a  designated  source  node  in  the  network,  which  can  originate  a  message  for 
broadcast  to  the  rest  of  the  nodes  in  the  network.  The  goal  is  to  ensure  that  if  the  source  is 
non-faulty,  every  non-faulty  node  in  the  network  should  correctly  receive  and  determine  the 
value originated by the source; if the source is faulty, all non-faulty nodes should agree on
some  common  value.  When  a  node  decides  upon  some  value  as  being  the  broadcast  value, 
we  say  that  it  commits  to  it. 

^The  assumption  that  MAC  addresses  cannot  be  spoofed  is  also  relevant  to  scenarios  where  link-layer 
authentication  mechanisms  are  available,  but  end-to-end  authentication  is  not.  This  is  quite  pertinent  to 
sensor network deployments, where end-to-end authentication may involve too much overhead to be justifiable,
but link-layer authentication may be feasible as it is much more lightweight. Link-layer authentication
would  assure  that  a  node  receiving  a  message  is  certain  of  the  identity  of  the  neighbor  that  transmitted  that 
message.

7.2.1 Implications of Reliable Local Broadcast Assumption

As per the reliable local broadcast assumption, if a node transmits a message, all its neighbors
are able to receive it, and are able to do so within a bounded amount of time. This
greatly  simplifies  the  task  of  achieving  reliable  broadcast  in  the  presence  of  a  faulty  source 
node.  Suppose  the  source  is  faulty.  There  are  two  ways  in  which  it  could  manifest  faulty 
behavior:  (1)  not  send  a  message  when  other  nodes  expect  it  to  do  so,  or  (2)  send  two 
conflicting  versions  of  the  same  message  containing  different  values.  If  case  (1)  occurs,  then 
neighbors  of  the  source  can  use  a  simple  timeout  mechanism,  whereby,  if  no  message  is 
received  from  the  source  within  a  certain  interval  of  the  expected  time,  they  commit  to  a 
default  value,  and  take  the  appropriate  steps  stipulated  by  the  algorithm  being  followed  to 
propagate  it  further.  If  case  (2)  occurs,  all  neighbors  receive  both  values,  and  the  duplicity 
of  the  source  is  detected.  Thus  the  non-faulty  neighbors  of  the  source  can  again  follow  some 
default  procedure  (either  commit  to  a  default  value,  or  to  the  first  value  received  from  the 
source),  and  take  appropriate  subsequent  steps.  Therefore,  the  source  has  no  incentive  to 
be  duplicitous. 
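
The decision rule followed by a neighbor of the source can be sketched as below. This is a minimal illustration under assumptions: the name `decide_from_source`, the constant `DEFAULT_VALUE`, and the flag `use_first_on_conflict` are hypothetical, and the input is taken to be the list of values heard from the source before the timeout, in arrival order. Because of the reliable local broadcast assumption, every neighbor of the source observes the same list in the same order, so all non-faulty neighbors reach the same decision.

```python
DEFAULT_VALUE = 0  # hypothetical default agreed upon in advance


def decide_from_source(received, use_first_on_conflict=False):
    """Decide on a value given what was heard from the source before the timeout.

    received: values heard from the source, in order of arrival.
    """
    if not received:
        # Case (1): the source stayed silent, so fall back to the default.
        return DEFAULT_VALUE
    if len(set(received)) > 1:
        # Case (2): conflicting values, so the source's duplicity is detected.
        return received[0] if use_first_on_conflict else DEFAULT_VALUE
    # Non-faulty behavior: a single consistent value was heard.
    return received[0]
```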

7.3  Related  Work 

We  now  review  some  existing  work  on  reliable  communication  in  the  presence  of  faults. 

Reliable  communication  under  Byzantine  failures  has  been  studied  for  point-to-point 
communication networks under various assumptions [2]. The seminal result of Pease,
Shostak and Lamport [89], [70] states that in case of full connectivity, Byzantine agreement
with $f$ faulty nodes is possible if and only if $n \geq 3f + 1$. Under more general communication
graphs, the requirements for Byzantine agreement are that $n \geq 3f + 1$, and the network
be at least $(2f + 1)$-connected [26]. Byzantine agreement in $k$-cast channels has been considered
in [21]. However, this does not capture the spatially dependent connectivity that
characterizes  radio  networks.  Reliable  broadcast  in  radio  networks  has  also  been  studied 
in [60] and [57]. In [57], an infinite grid network was considered. A locally-bounded fault
model was proposed, wherein an adversary was allowed to place faults subject to the constraint
that no neighborhood have more than $t$ faults. It was shown that under a Byzantine
failure model, reliable broadcast is not achievable for $t \geq \lceil \frac{1}{2} r(2r + 1) \rceil$ (in both $L_\infty$ and
$L_2$ metrics). In addition, a protocol was described that was able to achieve reliable broadcast
under the following conditions:

• If $t < \frac{1}{2}\left(r\left(r + \sqrt{\frac{r}{2}} + 1\right)\right)$, then reliable broadcast is achievable in the $L_\infty$ metric.

• If $t < \frac{1}{4}\left(r\left(r + \sqrt{\frac{r}{2}} + 1\right)\right) - 2$, then reliable broadcast is achievable in the $L_2$ metric.

This  protocol  stipulates  that  nodes  wait  till  they  hear  the  same  value  from  t  +  1  neighbors 
before they commit to it, and re-broadcast it exactly once for the benefit of other neighbors.
Under  this  protocol,  no  non-faulty  node  will  ever  accept  the  wrong  value.  However,  there 
is  a  possibility  of  some  nodes  never  being  able  to  decide,  and  the  achievability  bounds  do 
not match the impossibility bound, leaving a region of uncertainty. In [112], a tight bound
for  tolerable  t  using  the  simple  broadcast  protocol  of  [57]  was  established. 
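
The commit rule of the simple protocol of [57] can be sketched from the point of view of a single node as follows. This is an illustrative sketch, not the protocol's full specification: the class name `CpaNode` is hypothetical, and two details are assumptions made here, namely that a neighbor of the source commits directly to the value heard from the source, and that only the first value heard from each neighbor is counted.

```python
class CpaNode:
    """One node's view: commit after t + 1 distinct neighbors report the same value."""

    def __init__(self, t, source_id, neighbors):
        self.t = t
        self.source_id = source_id
        self.neighbors = set(neighbors)
        self.reports = {}      # neighbor id -> first value heard from that neighbor
        self.committed = None  # value committed to, once decided

    def on_receive(self, sender, value):
        """Process one local broadcast; return the value to re-broadcast, or None."""
        if self.committed is not None or sender not in self.neighbors:
            return None  # already decided (re-broadcast happens once) or not a neighbor
        if sender == self.source_id:
            # Assumption: a value heard directly from the source is accepted at once.
            self.committed = value
            return value
        self.reports.setdefault(sender, value)  # count each neighbor at most once
        matching = sum(1 for v in self.reports.values() if v == value)
        if matching >= self.t + 1:
            self.committed = value
            return value
        return None
```

Since at most $t$ neighbors of a node can be faulty, any value reported by $t + 1$ distinct neighbors has been reported by at least one non-faulty neighbor, which is why a node following this rule never commits to a wrong value.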

Further  study  of  the  locally  bounded  fault  model  is  undertaken  in  [90],  where  arbitrary 
graphs  are  considered  instead  of  a  specific  network  model.  While  the  discussion  mentions 
both  radio  and  message-passing  networks,  there  is  an  assumption  that  duplicity  by  the 
source  (sending  different  messages  to  different  neighbors)  is  impossible.  Upper  and  lower 
bounds for achievability of reliable broadcast are presented, based on graph-theoretic parameters,
for arbitrary graphs. However, no exact thresholds are established. Two broadcast
algorithms are considered. One is the simple algorithm of [57] that is referred to as the
Certified Propagation Algorithm (CPA). Another algorithm, termed the Relaxed Propagation
Algorithm (RPA), is described, which is $t$-locally safe (i.e., no non-faulty node will
commit  to  an  incorrect  value  by  following  it).  It  is  shown  that  RPA  is  a  more  powerful 
algorithm,  as  there  exist  graphs  for  which  RPA  succeeds  but  CPA  does  not.  It  is  also  shown 
that  there  exist  certain  graphs  in  which  algorithms  that  work  with  knowledge  of  topology 
succeed  in  achieving  reliable  broadcast,  while  those  that  lack  this  knowledge  fail  to  do  so. 
The  RPA  algorithm  and  our  algorithms  for  reliable  broadcast  described  in  Chapter  8  are 
quite  similar,  as  there  is  a  reliance  on  receiving  indirect  reports  about  values  committed  to 
by  nodes  through  a  sufficient  number  of  node-disjoint  paths. 

Scenarios  involving  a  collision-causing  adversary  are  addressed  in  [58,  38,  27].  The 
issue of achieving broadcast when a (locally bounded) adversary can cause a
bounded number of collisions or address spoofing is handled in [58]. It presents protocol
transformations  that  can  lead  to  resilience  to  a  bounded  number  of  collisions  or  address 
spoofing  attempts.  It  uses  the  protocol  described  in  Section  8.4  of  Chapter  8  as  a  building 
block. However, the result is based on the assumption that non-faulty nodes are not hindered
by energy-limitations, and can retransmit messages as many times as needed. The impact
of an energy-budget on consensus has been studied for a single-hop setting in [38], and it
has been proved that non-faulty nodes would require an at least incrementally larger budget
than  faulty  nodes  to  arrive  at  a  consensus.  In  [91],  conditions  for  broadcast  have  been 
established  under  a  probabilistic  transient  failure  model,  where  faulty  behavior  also  includes 
the  possibility  of  nodes  causing  collision. 

Probabilistic failures are considered in [91], which examines the case of message-passing
and  radio  networks  with  random  transient  failures.  The  transient  failure  behavior  includes 
the  possibility  of  causing  collision. 

Communication  of  information  in  a  single-hop  multi-channel  wireless  network  with  a 
malicious  adversary  that  can  cause  collisions  concurrently  in  a  limited  number  of  channels 
has  been  considered  in  [27]. 

Also  related  is  work  in  [109]  on  unknown  fixed  identity  networks;  this  work  assumes 
that  nodes  cannot  fake  their  identity  to  their  neighbors.  Our  model  also  has  a  similar 
assumption. 

7.3.1  Crash-stop  Failures 

For  crash-stop  faults,  the  reliable  broadcast  problem  reduces  to  the  connectivity  problem. 

Crash-stop  failures  are  considered  in  [60]  for  finite  networks  comprising  nodes  located 
in  a  regular  grid  pattern.  The  focus  is  on  obtaining  algorithms  for  efficient  broadcast  to  the 
part  of  the  network  that  is  reachable  from  the  source,  and  not  on  quantifying  the  number 
of  faults  that  render  some  nodes  unreachable. 

A grid network model was considered in [100] where nodes are located at integer lattice
sites on a square grid, and fail independently. Nodes have a common transmission range $r$.
The probability of not failing is specified as $p$, and it is shown that a sufficient condition
for connectivity and coverage is that transmission range $r$ must be set to ensure that node
degree is $c_1 \frac{\log n}{p}$ (for some constant $c_1$). It is also shown that a necessary condition
for coverage (and hence for joint coverage and connectivity) is that node degree be at
least $c_2 \frac{\log n}{p}$ (for another constant $c_2$). A fallacy in the above necessary condition was
pointed  out  by  [62],  and  a  subsequent  correction  [102]  by  the  authors  of  [100]  presents 
examples  illustrating  that  the  necessary  condition  may  fail  to  hold  for  certain  sub-ranges  of 
$p$. The issue of coverage has been examined in detail in [62] for random, grid, and Poisson
deployments. However, the necessary and sufficient conditions formulated by them take a
more complex form, and do not point to a single $f(n,p)$ such that a degree of $\Theta(f(n,p))$ is
both necessary and sufficient for asymptotic coverage. Besides, the necessary condition is
formulated for the specific case when $p \to 0$ as $n \to \infty$.

We have also derived results for crash-stop failures in a grid network that yield a different
expression than [100], and while our results are within a constant factor of their results for
most values of $p$, our results are more accurate when $p \to 0$.

In [42], it was proved that in a unit area network with uniformly distributed node placement,
where nodes have a common transmission radius $r$, such that $\pi r^2 = \frac{\log n + c(n)}{n}$, the
network is asymptotically connected with probability one iff $c(n) \to \infty$. This constitutes the
case $p = 0$ for random networks. Recently, necessary and sufficient conditions for asymptotic
connectivity in a random network with low duty cycle sensors have been formulated
in  [55].  This  is  equivalent  to  the  problem  of  crash-stop  failures  in  a  random  network. 

On  a  related  note,  fault-tolerant  consensus  (in  the  presence  of  channel  unreliability  and 
crash-stop failures) has been studied in [20]. The focus is primarily on a single-hop network,
though  some  simulation  results  for  a  multi-hop  setting  are  also  reported. 

7.3.2  Reliable  Local  Broadcast 

Much  of  the  theoretical  work  mentioned  earlier  assumes  that  the  wireless  channel  itself  is 
perfectly  reliable.  The  lossy  nature  of  the  channel  is  not  accounted  for,  and  thus  many  of 
these  results  are  not  directly  applicable  to  a  real-world  scenario.  A  proposal  to  reconcile 
the  theory  and  practice  of  wireless  broadcast  has  been  made  in  [19].  They  identify  certain 
properties  that  a  reliable  local  broadcast  should  have.  They  introduce  some  models  to 
capture the nature of losses and collisions, viz., the No-Collisions (NC) model, the Eventual
No-Collisions  (ENC)  model,  the  Total  Collision  (TC)  model,  and  the  Partial  Collision  (PC) 
Model.  In  a  single-hop  network  conforming  to  the  TC-model,  it  is  shown  that  consensus  is 
achievable  with  any  number  of  Byzantine/crash-stop  failures.  However,  practical  realization 
of  the  TC  model  is  not  delved  into  in  detail  (though  some  possibilities  are  hinted  at). 

Another relevant body of work pertains to reliable multicast with probabilistic guarantees
[13], [78], which seeks to achieve a scalable solution.



7.3.3  Fault  Detection 


A  related  area  pertains  to  failure  detection.  Algorithms  that  detect  failure  can  be  very 
useful,  as  messages  received  from  nodes  detected  as  faulty  can  then  be  excluded  from 
future  communication.  This  can  help  improve  efficiency.  A  seminal  work  in  the  area  of 
failure  detection  is  the  PMC  Model  [94]  proposed  by  Preparata,  Metze  and  Chien.  [16] 
also  pertains  to  this  theme.  Results  for  failure-detection  in  a  scenario  with  locally  bounded 
faults are described in [68]. This work is quite relevant as the locally bounded model is also
addressed  by  us  in  Chapter  8,  in  the  context  of  reliable  broadcast.  Self-adjusting  Byzantine 
Agreement  is  considered  in  [129].  This  work  describes  how  the  Byzantine  nodes  can  be 
progressively  detected  in  a  network;  at  most  a  certain  number  of  broadcast  instances  can 
fail before all faults get detected.

Chapter 8

Reliable  Broadcast  with  Locally 
Bounded  Failures 


In this chapter, we study the reliable broadcast problem with a locally bounded fault
occurrence model, which was briefly introduced in Chapter 7. We begin by describing the
model and notation in Section 8.1, and then summarize the chapter results in Section 8.2.
We formulate a sufficient condition for achieving reliable broadcast in a general graph with
such a fault model in Section 8.3. In Section 8.4, we establish a bound for achievability of
reliable broadcast in a grid network model for the $L_\infty$ metric. This bound matches an
impossibility bound proved in [57], and thus establishes the exact threshold for this model. In
Section 8.6, we describe an approximate result for the $L_2$ (Euclidean) metric. We describe
an alternative broadcast algorithm in Section 8.7 which is also optimal in the grid network
for the $L_\infty$ metric, in the sense that it can tolerate the maximum number of tolerable faults.
We  discuss  interesting  issues  and  future  directions  in  Sections  8.8  and  8.9  respectively. 

8.1  Preliminaries 

We consider an infinite wireless network, with nodes situated on a grid (where each grid
square has side 1), under Byzantine and crash-stop failures. Note that the grid defines the
spatial layout of nodes, and not the network topology. All nodes use a common transmission
range $r$, which is assumed to be an integer. As described in Chapter 7, two distance metrics,
$L_\infty$ and $L_2$, are considered. In the $L_\infty$ metric, each node has exactly $4r^2 + 4r$ neighbors.
The results also hold for a finite toroidal network in which $r$ is smaller than the network
radius.  In  scenarios  where  the  entire  network  region  is  within  distance  r  of  the  designated 
source,  reliable  broadcast  is  trivially  always  achievable  due  to  the  reliable  local  broadcast 
assumption. 

In the grid network, nodes are identified by their grid location, i.e., $(x, y)$ denotes the
node at $(x, y)$. The neighborhood of $(x, y)$ comprises all nodes within distance $r$ of $(x, y)$
(according to the distance metric in consideration) and is denoted as $nbd(x, y)$. For succinct
description, we define a term $pnbd(x, y)$ where
$pnbd(x, y) = nbd(x - 1, y) \cup nbd(x + 1, y) \cup nbd(x, y - 1) \cup nbd(x, y + 1)$.
Intuitively $pnbd(x, y)$ denotes the perturbed neighborhood of
$(x, y)$, obtained by perturbing the center of the neighborhood to one of the nodes at unit
distance from $(x, y)$ on the grid.
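
The neighborhood and perturbed neighborhood can be enumerated directly from their definitions, as in the sketch below. The function names `nbd` and `pnbd` follow the notation of the text, while the `metric` argument and its string values `"linf"` and `"l2"` are assumptions made for this illustration. The returned neighborhood includes the center node itself, so a node's neighbor count is the set size minus one.

```python
def nbd(x, y, r, metric="linf"):
    """All grid nodes within distance r of (x, y), including (x, y) itself."""
    nodes = set()
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            # Every offset in the square is within L-infinity distance r;
            # for L2, keep only offsets inside the disc of radius r.
            if metric == "linf" or dx * dx + dy * dy <= r * r:
                nodes.add((x + dx, y + dy))
    return nodes


def pnbd(x, y, r, metric="linf"):
    """Union of the neighborhoods of the four grid nodes at unit distance from (x, y)."""
    return (nbd(x - 1, y, r, metric) | nbd(x + 1, y, r, metric)
            | nbd(x, y - 1, r, metric) | nbd(x, y + 1, r, metric))
```

For $r = 2$ in the $L_\infty$ metric, `nbd(0, 0, 2)` has 25 nodes, i.e., $4r^2 + 4r = 24$ neighbors plus the center, and it is contained in `pnbd(0, 0, 2)`.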

A non-faulty node shall be variously alluded to as a non-faulty or correct node, while
a node exhibiting Byzantine failure shall occasionally be referred to as a malicious node.
We shall occasionally refer to $nbd(S)$ where $S$ is a set. In such cases, $nbd(S) = \bigcup_{x \in S} nbd(x)$.

The locally-bounded fault occurrence model is considered, wherein an adversary is allowed
to place faults as it chooses, so long as no single neighborhood contains more than $t$
faults. When we refer to the neighborhood of a node $v$, it includes $v$ itself. Thus a correct
node may have up to $t$ faulty neighbors, while a faulty node may have up to $(t - 1)$ neighbors
that are also faulty.

As was discussed in Chapter 7, we assume that a node may not spoof another
node’s  MAC  address,  and  resultantly,  any  node  knows  the  correct  identity  of  the  previous 
hop  node  from  which  it  received  a  message.  No  collisions  are  possible,  i.e.,  there  exists  a 
pre-determined  collision-free  TDMA  schedule  that  all  nodes  follow. 

A  designated  source  (that  is  assumed  to  be  located  at  the  origin  of  the  grid  coordinate 
system,  w.l.o.g.)  broadcasts  a  message  with  a  binary  value.  The  objective  is  to  ensure 
reliable  broadcast  of  this  value  (see  the  definition  of  the  reliable  broadcast  problem  in 
Section  7.2). 

8.2  Summary  of  Results 

We  prove  the  following  results: 

1.  We  describe  a  general  sufficient  condition  for  reliable  broadcast  in  a  general  network 
graph  under  the  reliable  local  broadcast  assumption,  which  provides  intuition  for  the 
subsequent  grid  network  results. 

2. We present a lower bound in the $L_\infty$ metric on the maximum number of Byzantine failures
$t$ that may occur in any given neighborhood without rendering reliable broadcast
impossible in the grid network model. We provide a constructive proof by describing
two algorithms that both achieve reliable broadcast in the $L_\infty$ metric whenever
$t < \frac{1}{2} r(2r+1)$. This exactly matches an impossibility bound proved in [57], and thus
establishes an exact threshold for Byzantine agreement under this network model.
For completeness, we also study crash-stop failures, and prove that reliable broadcast
is achievable with locally bounded crash-stop failures iff the number of faulty nodes
in any neighborhood is $t < r(2r+1)$ (in the $L_\infty$ metric).

3. We present approximate bounds for $L_2$, i.e., the Euclidean metric, and show that when $r$
is sufficiently large, the thresholds must lie in a similar range as for $L_\infty$. In particular, we
argue  that  for  sufficiently  large  r,  Byzantine  agreement  is  indeed  possible  in  Euclidean 
metric  if  slightly  less  than  one-fourth  of  the  nodes  in  any  given  neighborhood  may  be 
faulty,  while  it  is  possible  to  tolerate  crash-stop  failures  if  they  are  slightly  less  than 
half  the  neighborhood  population. 
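As a small worked illustration of the $L_\infty$ thresholds in item 2, the following sketch computes the largest tolerable integer $t$ for a few values of $r$. The function names are introduced here for illustration only.

```python
def max_byzantine_faults(r):
    """Largest integer t satisfying t < r(2r+1)/2."""
    n = r * (2 * r + 1)
    return (n - 1) // 2

def max_crash_faults(r):
    """Largest integer t satisfying t < r(2r+1)."""
    return r * (2 * r + 1) - 1

# Columns: r, neighborhood size (2r+1)^2, Byzantine threshold, crash-stop threshold.
for r in (1, 2, 3):
    print(r, (2 * r + 1) ** 2, max_byzantine_faults(r), max_crash_faults(r))
```

For $r = 2$, for instance, a neighborhood holds 25 nodes, of which up to 4 may be Byzantine or up to 9 may crash.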

A  preliminary  version  of  some  of  the  chapter  results  was  reported  in  [5]. 

8.3  A  General  Sufficient  Condition 

Consider a general undirected graph $G = (V, E)$, whose topology is known to all network
nodes. Designate a node $s \in V$ as the source of the broadcast. An $s$-cut is a partition
$C = (S, V \setminus S)$ such that $s \in S$. In the course of a broadcast operation, $S$ can potentially
denote the set of nodes that have already had the opportunity to correctly determine the
broadcast value, and commit to it (note that all non-faulty nodes in $S$ will thus indeed have
committed to the correct value, while the behavior of faulty nodes is indeterminate). $V \setminus S$
can potentially denote the set of nodes that are yet to do so.

Let us consider the case where $G$ is a finite graph. In this case, any cut $C$ may be
considered as an envelope for the advancing frontier of the broadcast at some instant, with
further expansion of the frontier depending on the existence of sufficient connectivity across
the cut. If the cut $C$ were indeed encountered during algorithm operation, this is evidently
true. However, even if the cut $C = (S, V \setminus S)$ were not actually encountered during algorithm
operation, the following argument can be made:

Figure 8.1: Equivalence of Cut Conditions

At any point of time $t$ during algorithm execution, let the actual frontier be denoted by
the cut $C_{actual}(t) = (S_{actual}(t), V \setminus S_{actual}(t))$. Consider an algorithm step at time $t'$ such
that for all $t < t'$, $S_{actual}(t) \subset S$, but $S_{actual}(t') \not\subseteq S$. Thus, at time $t'$, at least one node
$u \in V \setminus S$ crossed over from $V \setminus S_{actual}$ to $S_{actual}$ (i.e., received sufficient information to be
able to commit to the correct value, and, if it is non-faulty, indeed committed to it).
At time $t < t'$, the frontier of the broadcast (i.e., $C_{actual}$) lay strictly behind
the frontier defined by $C = (S, V \setminus S)$. Thus, if a node has access to sufficient information
flowing to it from $S_{actual}$ to be able to cross over, then it must necessarily have access to at
least as much information flowing to it from $S$ (since the network topology, and hence paths
in the network, are the same in both cases, and the set of nodes that already definitively
know the correct value in the latter case is a superset of that in the former case), and be
able to cross the cut $C = (S, V \setminus S)$, if it had been encountered. This is depicted in Fig.
8.1. Hence, the following two statements are equivalent:

• Statement 1: For every $s$-cut $(S, V \setminus S)$ of the graph that is actually encountered
during algorithm execution, some node $u \in V \setminus S$ possesses sufficient connectivity to
be able to cross over to $S$ from $V \setminus S$.

• Statement 2: For every possible $s$-cut $(S, V \setminus S)$ of the graph, assuming all nodes in
$S$ have had the opportunity to make a correct determination (and non-faulty nodes
have actually made it), some node $u \in V \setminus S$ possesses sufficient connectivity to be
able to cross over to $S$.

Hence, for a finite graph, Statement 2 does not impose a more stringent requirement
than Statement 1. We remark that the use of the notation $t$ for time in the prior discussion
should not be confused with the subsequent use of $t$ to denote the maximum number of
faults in any single neighborhood.

Lemma 43. Given a finite undirected graph $G = (V, E)$, Statement 1 is a sufficient condition
for feasibility of broadcast, and Statement 2 is an equivalent sufficient condition.

Proof.  This  may  be  seen  as  follows:  since  Statement  1  holds  for  every  encountered  cut, 
the  set  V  \  S  will  continue  to  decrease,  and  being  finite  will  eventually  become  empty.  At 
that  stage  S  =  V,  and  the  broadcast  will  have  successfully  reached  every  node  (and  all 
the non-faulty nodes will have made a determination of the correct value). Statement 2 is
equivalent  to  Statement  1,  and  is  hence  also  a  sufficient  condition.  □ 

It now remains to characterize what constitutes sufficient connectivity to be able to
cross over to the source side of the cut. The goal of any reliable broadcast algorithm is
that each non-faulty node should be able to eventually decide on the correct broadcast
value. If at any instant, the frontier is represented by cut $C = (S, V \setminus S)$, then by the
assumption of Statement 2, all nodes in $S$ have correctly determined the broadcast value.
Any communication of information across the cut must happen through the nodes in
$C_S = \{v \in S \mid \exists\, (v, u) \in E \text{ such that } u \in V \setminus S\}$. Therefore, for the purpose of analysis, it suffices
to transform the source side of the cut $S$ to $S' = \{s_{sup}\} \cup C_S \cup (nbd(C_S) \cap S)$, with $s_{sup}$ being
a new super-source node that acts as an abstract sender of the correct broadcast value, and
is connected directly to each node in $C_S$ (via pseudo-edges).^1 Other edges between included
vertices are preserved. The neighbors of vertices in $C_S$ on the source side are included to
enforce the per-neighborhood fault constraint amongst the vertices in $C_S$. We refer to the
corresponding graph induced by $V' = S' \cup (V \setminus S)$, with the pseudo-edges added, as the
reduced graph $G' = (V', E')$.

We  state  and  prove  the  following  sufficient  condition: 


Theorem 14. Given a finite undirected graph $G = (V, E)$ and designated source $s$, with
up to $t$ Byzantine faults in any neighborhood, reliable broadcast is achievable in $G$ if every
$s$-cut $C = (S, V \setminus S)$ (with $C_S$ denoting the set of vertices that have at least one incident edge
crossing the cut) satisfies the following: $\exists\, u \in V \setminus S$ such that either $(s, u) \in E$ or there exist
$(2t+1)$ node-disjoint $s_{sup} \leadsto u$ paths in the transformed graph $G'$, such that all intermediate
nodes on these paths lie within the neighborhood of some single node $v \in V'$, $v \neq s_{sup}$.

^1 This captures the fact that all non-faulty nodes in $C_S$ have determined the correct value.

Figure 8.2: Connectivity to super-source

Proof. Since all nodes in $S$, and hence in $C_S \subseteq S$, have had the opportunity to correctly
determine the broadcast value (by assumption), the addition of pseudo-edges with $s_{sup}$
ensures this same property (since neighbors of the source can trivially determine the value
correctly due to the reliable local broadcast assumption), while removing from consideration
nodes that are no longer relevant to the result we seek to prove. If a node is connected
to $s_{sup}$ via at least $2t+1$ node-disjoint paths that all lie within some single neighborhood,
then at most $t$ of these paths may have a faulty node (as no more than $t$ faults may exist in
any single neighborhood).^2 Thus, the node $u$ will eventually receive the correct value over
at least $t+1$ node-disjoint paths, and will be in a position to commit to it. The situation
is illustrated in Fig. 8.2.

By  Lemma  43,  this  is  a  sufficient  condition  for  finite  graphs.  □ 

Corollary 5. Given a finite undirected graph $G = (V, E)$ and designated source $s$, with
up to $t$ crash-stop faults in any neighborhood, reliable broadcast is achievable in $G$ if every
$s$-cut $C = (S, V \setminus S)$ (with $C_S$ denoting the vertices for which at least one incident edge
crosses the cut) satisfies the following: $\exists\, u \in V \setminus S$ such that either $(s, u) \in E$ or there exist
$(t+1)$ node-disjoint $(s_{sup}, u)$ paths in the reduced graph $G'$, such that all intermediate nodes
on these paths lie within the neighborhood of some node $v \neq s_{sup}$.

^2 Also note that each node is aware of the correct identity of the previous hop node from which it received
a message, and thus the identity of the last faulty node on a path is always revealed; hence $u$ will not
consider any other path through this faulty node when counting the number of disjoint paths through which
a value was received. This ensures that $u$ will count at most $t$ faulty paths for a value, and prevents faulty
nodes from confusing $u$ even if they tamper with previous hop path information.

Proof. When crash-stop failures are considered, reachability is synonymous with achievability
of reliable broadcast. If a non-faulty node is a neighbor of $s$, it will trivially receive
the broadcast. If a node is connected to $t+1$ nodes in $S$ via one path each such that all
$t+1$ paths are node-disjoint, and lie in a single neighborhood, then at most $t$ of these can
be faulty. Thus, there will be at least one fault-free path through which the node may be
reached, and the broadcast can propagate further. □

Infinite Graphs. For any finite fault-threshold $t$, one can argue that Theorem 14 also
holds for infinite graphs as follows: Suppose the condition stated in Theorem 14 holds, but
it is impossible for some nodes to determine the correct broadcast value. Consider the set
$D$ comprising all such nodes that are not able to eventually determine the correct value.
Evidently, none of the nodes in $D$ can be a neighbor of the source $s$, else such a node
would trivially have the opportunity to determine the correct value. Therefore, they are
all non-neighbors of $s$. By assumption, all nodes in $V \setminus D$ eventually have the opportunity
to determine the correct value. Consider the corresponding cut $(V \setminus D, D)$. Then, using
the proof argument of Theorem 14, there exists some node $u \in D$ such that there are at
least $2t+1$ node-disjoint $s_{sup} \leadsto u$ paths in the transformed graph $G'$ for cut $(V \setminus D, D)$.
Consider exactly $2t+1$ of these paths. Note that the nodes neighboring $s_{sup}$ on these paths
are $2t+1$ in number and belong to $V \setminus D$. By assumption, in the actual network, these
$2t+1$ nodes will eventually have the opportunity to determine the correct value. Once
these $2t+1$ nodes have had the opportunity to determine the correct value, $u$ would also
eventually receive information from enough node-disjoint paths, and have the opportunity
to determine the correct value. This yields a contradiction.

The  same  argument  can  be  used  for  Corollary  5. 

8.4  Byzantine  Failures  in  a  Grid  Network 

We prove the following result for locally bounded Byzantine failures in the grid network:

Theorem 15. If $t < \frac{1}{2} r(2r+1)$, reliable broadcast is achievable in the grid network for the
$L_\infty$ metric.

We  present  an  algorithm  to  achieve  reliable  broadcast,  based  on  the  same  intuition  as 
the  general  sufficient  condition  of  Theorem  14.  Without  loss  of  generality,  we  assume  that 
the message comprises a binary value (say 0 or 1). A non-faulty node that is not the source
is  said  to  commit  to  a  value  when  it  decides  that  it  is  indeed  the  value  originated  by  the 
source.  The  algorithm  requires  maintenance  of  state  by  each  node  pertaining  to  messages 
received  from  nodes  within  its  two-hop  neighborhood.  The  algorithm  operates  as  follows: 

•  Initially,  the  source  does  a  local  broadcast  of  the  message. 

• Each neighbor $i$ of the source commits to the first value $v$ it heard from the source
and does a one-time local broadcast of a COMMITTED$(i, v)$ message.

• Hereafter, the following algorithm is followed by each node $j$ (including those involved
in the previous two steps):

On receipt of a COMMITTED$(i, v)$ message from neighbor $i$, record the message,
and broadcast a HEARD$(j, i, v)$ message.

On receipt of a HEARD$(k, i, v)$ message from neighbor $k$, record the message, but do
not re-propagate.

On committing to a value $v$, do a one-time local broadcast of a COMMITTED$(j, v)$
message.

A node $j$ commits to a value $v$ if it has not already committed to a value, and
it becomes certain about value $v$. A node is said to be certain about a value $v$ if it
receives $v$ through COMMITTED or HEARD messages over at least $t+1$ node-disjoint
paths that lie within a single neighborhood. More precisely, a node $j$ is certain of a
value $v$ if there is a node $Q$ such that $j$ received some $t+1$ messages $m_1, m_2, \ldots, m_{t+1}$
where $m_i$ = COMMITTED$(A_i, v)$ or $m_i$ = HEARD$(A_i, A'_i, v)$, and all the $A_i, A'_i$ are
distinct nodes lying in the neighborhood of $Q$.^3
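The certainty rule above can be sketched as a check over recorded messages. This is a minimal illustration under stated assumptions, not the algorithm's prescribed implementation: each recorded message for a given value is represented by the tuple of nodes it names (one node for COMMITTED, two for HEARD), the candidate centers $Q$ are passed in explicitly, and the search for disjoint paths is greedy. The greedy search is sound (it never reports certainty without $t+1$ disjoint paths) but may miss some valid selections among two-node paths; an exact check would use maximum matching. The names `is_certain` and `linf_nbd_contains` are hypothetical.

```python
def linf_nbd_contains(center, node, r):
    """True if node lies in the L_inf neighborhood of radius r around center."""
    return max(abs(center[0] - node[0]), abs(center[1] - node[1])) <= r

def is_certain(paths, t, r, centers):
    """paths: node tuples for ONE value v, e.g. (A,) for COMMITTED(A, v)
    or (A, A2) for HEARD(A, A2, v). Returns True if a greedy search finds
    t+1 node-disjoint paths lying in the neighborhood of some center Q."""
    for q in centers:
        local = [p for p in paths if all(linf_nbd_contains(q, n, r) for n in p)]
        used, count = set(), 0
        # One-node paths first: choosing them never reduces the achievable count.
        for p in sorted(local, key=len):
            if len(set(p)) == len(p) and used.isdisjoint(p):
                used.update(p)
                count += 1
        if count >= t + 1:
            return True
    return False

# One COMMITTED report and one HEARD report, all nodes within nbd((1, 1)) for r = 1.
reports = [((0, 0),), ((1, 0), (1, 1))]
print(is_certain(reports, t=1, r=1, centers=[(1, 1)]))
print(is_certain(reports, t=2, r=1, centers=[(1, 1)]))
```

With two disjoint paths available, the node is certain when $t = 1$ but not when $t = 2$.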

^3 A faulty intermediate node can alter the affixed identity of the previous node listed in the HEARD
message (this is part of the message content, which can be altered). This does not cause a problem as the
identity of such a faulty intermediate node (let us call it $x$) on the forwarding path will always be revealed
to $j$ ($x$ must be $j$'s neighbor, as the forwarding paths involve only two hops, and hence $j$ knows its identity as
MAC addresses cannot be spoofed). Consequently, even if $x$ has altered the identity of the node before it on
a forwarding path, this is acceptable, as $j$ will not include any other message with a path through $x$ in the
set of $t+1$ messages, and given only $t$ faulty nodes in the neighborhood, at most $t$ out of the
$t+1$ paths can involve faulty information.

Theorem 16. No non-faulty node shall commit to a wrong value by following the previously
described algorithm.

Proof. The proof is by contradiction. Consider the first non-faulty node, say $j$, that makes a
wrong decision to commit to value $v$. Evidently, $j$ cannot be a neighbor of the source. This
implies it received the value $v$ from at least $t+1$ nodes through a single path (direct or two-hop)
each, such that all $t+1$ paths are node-disjoint, and lie in some single neighborhood.
Since the number of faults in any single neighborhood may be at most $t$, it implies that at
most $t$ of these paths could have a faulty source (of a COMMITTED message) or a faulty
intermediate node (that sends a HEARD message). Thus, all paths cannot have relayed
the wrong value, and so $v$ must indeed be the correct value. □

Theorem 17. Each non-faulty node is eventually able to commit to the correct value.

Proof. We prove that each non-faulty node will be able to meet the conditions stipulated
by the algorithm for committing to the correct value. The proof also clarifies the operation
of the algorithm. Intuitively, the essence of the proof lies in showing that each node $P$
(other than the direct neighbors of $(0, 0)$ which can trivially determine the correct value)
can receive information from a part of the network that has already committed to the
correct value, along $(2t+1)$ node-disjoint paths lying in some single neighborhood. This is
akin to the general sufficient condition of Section 8.3.

The proof is by induction.

Base Case: All non-faulty nodes in $nbd(0, 0)$ are able to commit to the correct value.
This follows trivially from our assumed model since they all hear the source directly.

Inductive Hypothesis: If all non-faulty neighbors of a node located at $(a, b)$, i.e., all
non-faulty nodes in $nbd(a, b)$, are able to commit to the correct value, then all non-faulty
nodes in $pnbd(a, b)$ are able to commit to the correct value.

Proof of Inductive Hypothesis: We show that for each node $P$ in $pnbd(a, b) \setminus nbd(a, b)$
there exists a set of $2t+1$ paths $\{\pi_1, \pi_2, \ldots, \pi_{2t+1}\}$ of the form $\pi_i = (A_i, P)$ or $\pi_i = (A_i, A'_i, P)$,
such that all $A_i, A'_i$ are distinct, lie in some single neighborhood, and all $A_i \in nbd(a, b)$.
Since no more than $t$ of the $A_i, A'_i$ can be faulty, this guarantees that the node will receive
the correct value through at least $(t+1)$ paths, and will also commit to it.

Consider a node $P$ belonging to $nbd(a, b+1)$. The argument for nodes in $nbd(a, b-1)$,
$nbd(a-1, b)$ and $nbd(a+1, b)$ is similar.

Node $P$ in $nbd(a, b+1) \setminus nbd(a, b)$ may be considered to be located at $(a - r + p, b + r + 1)$
where $0 \le p \le 2r$ (Fig. 8.3). We present an explicit argument for locations of $P$
corresponding to $0 \le p \le r$. A similar argument holds for the remaining locations, by
virtue of symmetry.

We show the existence of $r(2r+1)$ node-disjoint paths $\pi_1, \pi_2, \ldots, \pi_{r(2r+1)}$ that all lie
within the same single neighborhood (centered at $(a, b+r+1)$, and indicated by the
dark-edged square in Fig. 8.3). The region marked $A$ comprises
$\{(x, y) \mid (a - r) \le x \le (a + p);\ (b + 1) \le y \le (b + r)\}$, and nodes in this region lie in $nbd(a, b)$, and are also
neighbors of $P$. Thus, there are $r(r + p + 1)$ paths of the form $A \to P$. The region $B$
comprises $\{(x, y) \mid (a + p + 1) \le x \le (a + r);\ (b + 1) \le y \le (b + r)\}$, and falls in $nbd(a, b)$. The
region $B'$ is obtained by a translation of $B$ to the left by $r$ units, and then up by $r$ units.
Thus, region $B'$ comprises $\{(x, y) \mid (a + p + 1 - r) \le x \le a;\ (b + r + 1) \le y \le (b + 2r)\}$, and
falls in $nbd(P)$. Consequently, there is a one-to-one correspondence between a point $(x, y)$
in $B$ and a point $(x - r, y + r)$ in $B'$, such that the points in each pair are neighbors. This
yields $r(r - p)$ paths of the form $B \to B' \to P$.

Thus, $r(r + p + 1) + r(r - p) = r(2r+1)$ node-disjoint paths are obtained. Since
$t < \frac{1}{2} r(2r+1)$ implies $2t + 1 \le r(2r+1)$, these paths suffice.
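The path construction in this proof can be checked mechanically for small $r$. The following sketch enumerates regions $A$, $B$ and $B'$ as defined above (with inclusive bounds) and asserts the properties the proof relies on; the helper names `linf`, `build_paths` and `check` are introduced here for illustration.

```python
def linf(p, q):
    """L_inf distance between two grid points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def build_paths(a, b, r, p):
    """Paths to P = (a - r + p, b + r + 1), 0 <= p <= r, via regions A, B, B'."""
    P = (a - r + p, b + r + 1)
    A = [(x, y) for x in range(a - r, a + p + 1) for y in range(b + 1, b + r + 1)]
    B = [(x, y) for x in range(a + p + 1, a + r + 1) for y in range(b + 1, b + r + 1)]
    # One-hop paths A -> P, and two-hop paths B -> B' -> P with B' = B shifted by (-r, +r).
    paths = [(u, P) for u in A] + [(u, (u[0] - r, u[1] + r), P) for u in B]
    return P, paths

def check(a, b, r, p):
    P, paths = build_paths(a, b, r, p)
    center = (a, b + r + 1)
    inner = [n for path in paths for n in path[:-1]]
    assert len(paths) == r * (2 * r + 1)                        # path count
    assert len(set(inner)) == len(inner) and P not in inner     # node-disjoint
    assert all(linf(n, center) <= r for n in inner + [P])       # single neighborhood
    assert all(linf(path[0], (a, b)) <= r for path in paths)    # start in nbd(a, b)
    assert all(linf(u, v) <= r                                  # every hop is a link
               for path in paths for u, v in zip(path, path[1:]))
    return True

print(all(check(0, 0, r, p) for r in range(1, 6) for p in range(r + 1)))
```

The check covers only the case $0 \le p \le r$ treated explicitly in the proof; the remaining positions follow by symmetry.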

Observe  that  the  inductive  hypothesis  along  with  the  base  case  suffice  to  show  that 
every  non-faulty  node  will  eventually  commit  to  the  correct  value,  since  starting  at  (0,0), 
one  can  cover  the  entire  infinite  grid  by  moving  up,  down,  left  and  right.  Therefore,  non- 
faulty  nodes  in  the  neighborhood  of  every  grid  point  can  be  shown  to  be  eventually  able  to 
determine the broadcast value. □

Figure 8.3: Existence of Sufficient Connectivity

8.5  Crash-Stop  Failures  in  a  Grid  Network 

When  only  crash-stop  failures  occur,  the  sole  criterion  for  achievability  is  reachability,  and 
no  special  algorithm  is  required.  Each  node  that  receives  a  value  commits  to  it,  re-broadcasts 
it  once  for  the  benefit  of  others,  and  then  may  terminate  local  execution  of  the  algorithm. 

In this failure mode, we establish an exact threshold for tolerable faults in the $L_\infty$ metric.
The  impossibility  bound  is  trivial  to  derive  but  we  state  and  prove  it  here  for  the  sake  of 
completeness. 

Theorem 18. Under a crash-stop failure model, if $t \ge r(2r+1)$, it is impossible to achieve
reliable broadcast in the grid network, with the $L_\infty$ metric.

Proof. We present a construction with $t = r(2r+1)$ that renders reliable broadcast impossible.
Consider the network in Fig. 8.4. The nodes in the designated region
$\{(x, y) \mid a \le x < a + r\}$ (for some $a > 1$) are all faulty while all other nodes are non-faulty. As may be
seen, the maximum number of faulty nodes in any given neighborhood is at most $r(2r+1)$.
However, this configuration partitions all nodes in the half-plane $x \ge a + r$ from the source
and they are unable to receive the broadcast. □

Figure 8.4: Network Partition due to Crash-Stop Failures
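The partition argument can be reproduced on a finite window. The sketch below is an illustration under assumptions of its own: a bounded, non-toroidal window stands in for the infinite grid, the faulty strip is taken to be $r$ columns wide ($a \le x < a + r$), and the source at $(0, 0)$ is non-faulty. Flooding is the simple crash-stop algorithm of this section.

```python
from collections import deque

def flood(width, height, r, faulty, source=(0, 0)):
    """Crash-stop flooding on a width x height window with L_inf range r."""
    reached, queue = {source}, deque([source])
    while queue:
        x, y = queue.popleft()
        for nx in range(max(0, x - r), min(width, x + r + 1)):
            for ny in range(max(0, y - r), min(height, y + r + 1)):
                # Crashed nodes neither receive nor relay the value.
                if (nx, ny) not in reached and (nx, ny) not in faulty:
                    reached.add((nx, ny))
                    queue.append((nx, ny))
    return reached

def max_faults_per_nbd(width, height, r, faulty):
    """Largest number of faulty nodes in any neighborhood centered in the window."""
    return max(sum((nx, ny) in faulty
                   for nx in range(x - r, x + r + 1)
                   for ny in range(y - r, y + r + 1))
               for x in range(width) for y in range(height))

r, a, W, H = 2, 3, 12, 9
strip = {(x, y) for x in range(a, a + r) for y in range(H)}
reached = flood(W, H, r, strip)
print(max_faults_per_nbd(W, H, r, strip) == r * (2 * r + 1))  # fault bound is met exactly
print(any(x >= a + r for x, _ in reached))                    # is the far side reached?
```

Narrowing the strip by one column in this sketch lets the flood cross it, which matches the achievability result that follows.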

The  achievability  bound  can  be  obtained  from  the  result  for  the  Byzantine  model. 

Theorem 19. Under a crash-stop failure model, if $t < r(2r+1)$, it is possible to achieve
reliable broadcast in the grid network, with the $L_\infty$ metric.

Proof. Consider the proof for the Byzantine fault-tolerant algorithm in Section 8.4. Given
that $nbd(a, b)$ has decided, there exist $r(2r+1)$ node-disjoint paths of the form described in
Theorem 17 that lie in one single neighborhood. Since $t < r(2r+1)$, at least one path will
be fault-free, thereby enabling the broadcast to propagate to $pnbd(a, b)$. Thus, by inductive
reasoning, all fault-free nodes on the grid will receive the broadcast. □

8.6  Euclidean  Metric 

In this section, we consider the issue of reliable broadcast in the $L_2$, i.e., Euclidean, metric.
We refrain from establishing exact thresholds as it is difficult to precisely determine lattice
points falling in areas bounded by circular arcs. We present an approximate argument
showing that reliable broadcast in $L_2$ is achievable if slightly less than one-fourth of the
nodes in any neighborhood exhibit Byzantine faults. We work with the value $t < 0.24\pi r^2$.


Figure  8.5:  Illustrating  an  Approximate  Argument  for  Euclidean  Metric 

The basis for the approximate argument is that, given a closed simple region of area
$A$, and perimeter $p$, bounded by up to $k$ straight line segments and circular arcs of radius
$r$, where $k$ is a small constant, the number of lattice points lying within it, $N_l$, is given by
$N_l = A \pm O(p)$, and the constant hidden in the $O(p)$ term is small. The justification for
this claim is based on Pick's Theorem [113], and is presented in Appendix D.

Therefore, for sufficiently large $r$, the number of nodes that lie in various considered
subregions of a circle of radius $r$ (elaborated later) are approximately $A \pm O(r)$ each (where
$A$ is the area of that subregion). Thus, we expect the argument to hold well for large values
of $r$.
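For the simplest such region, the full disk of radius $r$, the area approximation is easy to observe numerically. The sketch below counts lattice points in the disk and prints the deviation from $\pi r^2$ normalized by $r$; it illustrates the claim for this one region only and does not reproduce the calculation of Appendix D. The function name is introduced here.

```python
import math

def lattice_points_in_disk(r):
    """Number of integer points (x, y) with x^2 + y^2 <= r^2."""
    count = 0
    for x in range(-r, r + 1):
        half = math.isqrt(r * r - x * x)   # largest |y| allowed in this column
        count += 2 * half + 1
    return count

# Columns: r, lattice count, (count - area) / r.
for r in (5, 20, 80):
    n = lattice_points_in_disk(r)
    print(r, n, round((n - math.pi * r * r) / r, 3))
```

The last column staying bounded as $r$ grows is what the $A \pm O(r)$ estimate asserts.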

The  argument  uses  induction,  as  in  the  previous  section. 

Base Case: All non-faulty nodes in $nbd(0, 0)$ are able to commit to the correct value.
This follows trivially since they hear the origin directly.

Inductive Hypothesis: If all non-faulty neighbors of a node located at $(a, b)$ are able to
commit to the correct value, then all non-faulty nodes in $pnbd(a, b)$ are able to commit to
the correct value.

Figure 8.6: Approximate Construction depicting Node-Disjoint Paths ($NQ$ from Fig. 8.5
rotated to x-axis)

Justification of Inductive Hypothesis: We show that each node in $pnbd(a, b) \setminus nbd(a, b)$
is connected to $2t+1$ nodes in $nbd(a, b)$ via one path each, such that all these $2t+1$ paths are
node-disjoint and they all (the endpoints, as well as any intermediate nodes) lie entirely in
one single neighborhood. Since no more than $t$ of these can be faulty, this would guarantee
that the node will receive the correct value through at least $t+1$ such paths, and commit
to it.

Consider the node at $(a, b)$, as in Fig. 8.5. Let $d$ be the distance between the node at
$(a, b)$ (we call it node $N$) and any node in $pnbd(a, b) \setminus nbd(a, b)$ (we call it node $Q$). Then
$d \le r + 1$ (from the triangle inequality). It suffices to consider the possibility $d = r + 1$, as
that yields the least overlap between the neighborhoods of $N$ and $Q$.

We consider the situation in Fig. 8.6 with $NQ$ from Fig. 8.5 rotated to the horizontal
axis for clarity of presentation. We attempt to construct node-disjoint paths that all lie
within the neighborhood centred at $M$ (the midpoint of $NQ$) or the grid location nearest
to it. If $M$ is itself not a grid point, the resultant perturbation of the neighborhood centre
to the nearest grid location can only affect the presented calculations by $O(r)$. The set of
nodes marked $A$ are common neighbors of $N$ and $Q$ and constitute one-hop paths ($A \to Q$).
A set of two-hop paths $B_1 \to B_2 \to Q$ is also formed where each point $(x, y)$ in region
$B_1$ has a corresponding point in $B_2$ (its image under reflection by axis $OO'$). Thus, in an
approximate sense, for almost every grid point in $B_1$, we can find a unique grid point in $B_2$
with which it can be paired (with up to $O(r)$ unpaired grid points remaining).

The number of paths is thus approximately equal to the sum of the areas $A$ and $B_1$,
which turns out to be approximately $1.538 r^2 \approx 0.49\pi r^2 > 2(0.24\pi r^2) + 1$ for sufficiently
large $r$. The details of the calculation are presented in Appendix D. Thus approximately
$0.24\pi r^2$ Byzantine faults may be tolerated.

We also argue that reliable broadcast is not possible if $t > 0.3\pi r^2$ (approximately). The
argument is based on a construction identical to that presented in [57] for $L_\infty$, which is
depicted in Fig. 8.7. As proved in [57], this arrangement of faults renders reliable broadcast
impossible (see [57] for details). Note that the maximum number of faults lying in any single
neighborhood is given by the number of faulty nodes in the circled region (Fig. 8.7). The
relevant area is approximately $0.6\pi r^2$, and we expect approximately $0.6\pi r^2 \pm O(r)$ nodes to
lie in it. Half of these, i.e., around $0.3\pi r^2 \pm O(r)$, are to be faulty. This yields the argument
that if $t > 0.3\pi r^2$ (approximately), reliable broadcast would be unachievable. Thus the
critical threshold for the $L_2$ metric would lie between a 0.24 and a 0.3 fraction, i.e., in the
vicinity of a one-fourth fraction of faults.

Observe that the above argument also implies that around $2t = 0.48\pi r^2$ crash-stop
failures may be tolerated, while around $0.6\pi r^2$ failures per neighborhood would render
reliable broadcast impossible.

8.7  An  Alternative  Broadcast  Algorithm 

In  this  section,  we  describe  an  alternative  algorithm.  Though  this  algorithm  requires  greater 
message  overhead  than  the  algorithm  described  in  Section  8.4,  it  is  of  some  interest,  as  it 
demonstrates  the  existence  of  a  stronger  localized  connectivity  property  in  the  grid,  which 
may  possibly  have  relevance  in  contexts  other  than  reliable  broadcast. 

As in Section 8.4, we assume w.l.o.g. that the message comprises a binary value (say
0 or 1). A node that is not the source is said to commit to a value when it decides that it
is indeed the value originated by the source.

Figure 8.7: Impossibility Construction for Byzantine Failures in Euclidean metric

The algorithm requires maintenance of state by each node pertaining to direct/indirect
report messages for nodes within its four-hop neighborhood.

The algorithm operates as follows:

•  Initially, the source does a local broadcast of the message.

•  Each neighbor i of the source immediately commits to the first value v it heard
from the source, and then locally broadcasts it once in a COMMITTED(i, v) message.

•  Hereafter, the following algorithm is followed by each node j (including those involved
in the previous two steps):

On receipt of a COMMITTED(i, v) message from neighbor i, record the message, and
locally broadcast a HEARD(j, i, v) message.

On receipt of a HEARD(k, i, v) message from a neighbor k, record the message, and
locally broadcast a HEARD(j, k, i, v) message.

On receipt of a HEARD(l, k, i, v) message, record the message, and locally broadcast
a HEARD(j, l, k, i, v) message.

On receipt of a HEARD(g, l, k, i, v) message, record the message, but do not
re-propagate.

On committing to a value v, do a one-time local broadcast of COMMITTED(j, v).


A node j commits to a value v if it has not already committed to a value, and it reliably
determines that at least t + 1 nodes lying in some single neighborhood have committed
to v. j is said to have reliably determined the value committed to by node i if one of
the following conditions holds:

- i is its neighbor, and j heard COMMITTED(i, v) directly. In this case, there is
no cause for doubt as to the value announced by node i, since no other node is
capable of spoofing i's address, and collisions are ruled out.

- j heard indirect reports of i having committed to a particular value v through
t + 1 node-disjoint paths that all lie within some single neighborhood. The
indirect reports are obtained through the HEARD messages that propagate via
up to three intermediate nodes (i.e., up to four hops from the node that sent the
COMMITTED message), and the path information is obtained from these
messages (as each forwarding node affixes its identifier to the message).¹
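
The forwarding rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's implementation: the `Report` representation (a path tuple with the most recent sender first and the committing node last), and the helper names `relay` and `fits_in_one_neighborhood`, are assumptions introduced here. The second helper relies on the L∞ grid model, in which a set of grid points lies in a single neighborhood exactly when its extent along each axis is at most 2r.

```python
from typing import Iterable, Optional, Tuple

Node = Tuple[int, int]                  # grid coordinates of a node (assumed identifier)
Report = Tuple[Tuple[Node, ...], int]   # (path, value); path[0] is the latest sender,
                                        # path[-1] is the node that sent COMMITTED

def relay(j: Node, report: Report) -> Optional[Report]:
    """Return the HEARD report node j should broadcast, or None if it must not relay.

    COMMITTED(i, v) is modeled as ((i,), v); HEARD(k, i, v) as ((k, i), v), etc.
    A report whose path already has four entries, HEARD(g, l, k, i, v), is
    recorded by the receiver but not propagated further.
    """
    path, v = report
    if len(path) >= 4:
        return None
    return ((j,) + path, v)             # j affixes its own identifier

def fits_in_one_neighborhood(nodes: Iterable[Node], r: int) -> bool:
    """True if all (non-empty) nodes lie in some single L-infinity neighborhood of radius r."""
    xs, ys = zip(*nodes)
    return max(xs) - min(xs) <= 2 * r and max(ys) - min(ys) <= 2 * r
```

A full commit check would additionally have to select t + 1 node-disjoint paths among the recorded reports; that step is deliberately left out of this sketch.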

Theorem 20. No non-faulty node shall commit to a wrong value by following the above
algorithm.

Proof. The proof is by contradiction. Consider the first non-faulty node, say j, that makes
a wrong decision to commit to a value v. Evidently, j cannot be a neighbor of the source.
This implies it reliably determined that t + 1 already committed nodes lying in some single
neighborhood $N_1$ had committed to v. Since reliable determination of a node i having
committed to a value v involves hearing i directly or hearing indirect reports (that i committed
to v) via at least t + 1 node-disjoint paths lying in some single neighborhood $N_2$, and since
the number of faults in $N_2$ may be at most t, all these paths cannot have relayed the wrong
value, and v must indeed be the committed-to value announced by i. Thus, no non-faulty
node can make a wrong determination of what value each of the t + 1 nodes in $N_1$ committed
to (or claimed to commit to). Since j is the first non-faulty node to make a wrong decision,
the non-faulty nodes amongst the t + 1 nodes could not have made a wrong decision, and
a value committed to and announced by such nodes must be correct. Also, all of the t + 1
nodes cannot be faulty, as no more than t nodes in any neighborhood may exhibit Byzantine
failure. Therefore, at least one other non-faulty node previously committed to v. So, it
must indeed be the correct value, else we would obtain a contradiction. □

¹Note that a faulty intermediate node can affix a false identity for itself, or alter the affixed identities
of previous nodes listed in the message it is forwarding (these are part of the message content, which can
be altered). This does not cause a problem as the identity of the last faulty node (let us call it x) on the
forwarding path will always be revealed to j (either x is j's neighbor, in which case j knows its identity as
MAC addresses cannot be spoofed, or there is some other non-faulty node on the forwarding path after x
which knows the message was relayed through x since it knows the correct MAC address of the previous
hop node). Thus, even if x has affixed a wrong identity for itself in the message path information, the next
non-faulty node can detect this and rectify the situation, and subsequent relays are all non-faulty. Therefore,
j will know that x lies on the path. Hence, even if x has altered the identities of nodes before it on the
forwarding path, this is acceptable, as j will not consider any other message with a path through x, and
as a result, given only t faulty nodes in the neighborhood, at most t out of the t + 1 paths can involve faulty
information.

Theorem  21.  Each  non-faulty  node  is  eventually  able  to  commit  to  the  correct  value. 

Proof. We prove that each non-faulty node will be able to meet the conditions stipulated by
the algorithm for committing to the correct value. The essence of the proof lies in showing
that each node j other than the direct neighbors of (0, 0) is connected to at least 2t + 1
nodes that lie in some single neighborhood $N_1$, such that the connectivity to each such node
is through 2t + 1 node-disjoint paths that all lie in some neighborhood $N_2$, and the nodes
in $N_1$ are able to commit to the correct value before node j has done so.

The proof is by induction.

Base Case: All non-faulty nodes in nbd(0, 0) are able to commit to the correct value.
This follows trivially from the assumed model, since they hear the origin directly.

Inductive Hypothesis: If all non-faulty neighbors of a node located at (a, b), i.e., all
non-faulty nodes in nbd(a, b), are able to commit to the correct value, then all non-faulty
nodes in pnbd(a, b) are able to commit to the correct value.

Proof of Inductive Hypothesis: We show that each node in pnbd(a, b) \ nbd(a, b) is
able to reliably determine the value committed to by 2t + 1 nodes in nbd(a, b). Since no
more than t of these can be faulty, this guarantees that the node will become aware of t + 1
nodes in nbd(a, b) having committed to a (the correct) value, and will also commit to it. In
order to show this, we prove that each node is connected to at least 2t + 1 nodes in nbd(a, b)
either directly, or through 2t + 1 node-disjoint paths that all lie entirely within some single
neighborhood. Thus at least t + 1 of these paths are guaranteed to be fault-free and shall
allow communication of the correct value.

Figure 8.8: Nodes in nbd(a, b) whose committed values P can reliably determine

Figure 8.9: Nodes in nbd(a, b) that are immediate neighbors of P

We show this for a corner node in pnbd(a, b) \ nbd(a, b), i.e., the node marked P (which
is located at $(a - r, b + r + 1)$) in Fig. 8.8. This represents the worst case. For all other nodes
in pnbd(a, b) \ nbd(a, b), the condition can be seen to be achieved via a similar argument.
We briefly discuss this later.

We show that node P is able to reliably determine the values committed to by the nodes
in the shaded region M in Fig. 8.8. Region M comprises $\{(a - r + p, b - r + q) \mid 2r \ge q > p \ge 0\}$
and hence has r(2r + 1) nodes.

The first observation is that P can directly hear the nodes in the shaded sub-region R in
Fig. 8.9, comprising $\{(x, y) \mid a - r \le x \le a;\ b + 1 \le y \le b + r\}$ (this constitutes r(r + 1)
nodes), and so can trivially reliably determine the value they committed to. The remaining
sub-regions are depicted in Fig. 8.10 as U (comprising $\frac{1}{2}r(r - 1)$ nodes), $S_1$ (comprising r
nodes), and $S_2$ (comprising $\frac{1}{2}r(r - 1)$ nodes).
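
As a sanity check on these counts, the following Python sketch enumerates the regions on the integer grid and confirms that R, U, S1 and S2 partition M. It is an illustrative aid only: the function names are introduced here, and the set definitions assume the non-strict inequalities (for example 2r ≥ q > p ≥ 0 for M) implied by the stated node counts and by the coordinate descriptions of U, S1 and S2 given in this proof.

```python
def regions(a, b, r):
    """Enumerate M and its sub-regions R, U, S1, S2 as sets of grid points."""
    M = {(a - r + p, b - r + q) for q in range(1, 2 * r + 1) for p in range(0, q)}
    R = {(x, y) for x in range(a - r, a + 1) for y in range(b + 1, b + r + 1)}
    U = {(a + p, b + q) for q in range(2, r + 1) for p in range(1, q)}
    S1 = {(a - r, b - p) for p in range(0, r)}
    S2 = {(a - q, b - p) for q in range(1, r) for p in range(0, q)}
    return M, R, U, S1, S2

def check_partition(a, b, r):
    """Check the node counts and that R, U, S1, S2 are disjoint and cover M."""
    M, R, U, S1, S2 = regions(a, b, r)
    parts = [R, U, S1, S2]
    P = (a - r, b + r + 1)                              # worst-case corner node
    assert len(M) == r * (2 * r + 1)
    assert len(R) == r * (r + 1) and len(S1) == r
    assert len(U) == len(S2) == r * (r - 1) // 2
    assert sum(len(s) for s in parts) == len(M)         # pairwise disjoint ...
    assert set().union(*parts) == M                     # ... and covering M
    # every node of R is within L-infinity distance r of P
    assert all(max(abs(x - P[0]), abs(y - P[1])) <= r for x, y in R)
    return True
```

Running `check_partition` for a few radii and centers exercises the identity r(r + 1) + ½r(r − 1) + r + ½r(r − 1) = r(2r + 1).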

We now explicitly prove existence of suitable node-disjoint paths for nodes that lie in
the upper triangular region U in Fig. 8.10. Any node N in this region may be considered
located at $(a + p, b + q)$ (Fig. 8.11), such that $r \ge q > p \ge 1$ in this region. We show the
existence of r(2r + 1) node-disjoint paths between N and P, that all lie within the same
single neighborhood (centered at $(a, b + r + 1)$, and indicated by the square with dark outline
in Fig. 8.12). For greater clarity, the spatial extents of various demarcated regions used in
the following argument are tabulated in Table 8.1.

Consider Fig. 8.12. The region marked A comprises $\{(x, y) \mid a + p - r \le x \le a;\ b + 1 \le y \le b + q + r\}$,
and nodes in this region are neighbors of both N and P. Thus, there are
$(r - p + 1)(r + q)$ paths of the form $N \to A \to P$ that comprise one intermediate node each.

Figure 8.10: Nodes in nbd(a, b) to which P has sufficient connectivity

Figure 8.11: A node N in Region U

Region   x-extent                               y-extent
A        $a + p - r \le x \le a$                $b + 1 \le y \le b + q + r$
B1       $a + 1 \le x \le a + p - 1$            $b + 1 \le y \le b + q + r$
B2       $a + 1 - r \le x \le a + p - 1 - r$    $b + 1 \le y \le b + q + r$
C1       $a + p + 1 \le x \le a + r$            $b + q + 1 \le y \le b + r + 1$
C2       $a + p + 1 - r \le x \le a$            $b + q + 1 + r \le y \le b + 1 + 2r$
D1       $a + p \le x \le a + p + r - q$        $b + r + q - p + 1 \le y \le b + r + q$
D2       $a + 1 \le x \le a + p$                $b + 1 + r + q \le y \le b + 1 + 2r$
D3       $a + 1 - r \le x \le a + p - r$        $b + 1 + r + q \le y \le b + 1 + 2r$
J        $a - 2r \le x \le a$                   $b + 1 \le y \le b - p + r$
K1       $a - 2r \le x \le a$                   $b - p + 1 \le y \le b$
K2       $a - 2r \le x \le a$                   $b - p + r + 1 \le y \le b + r$

Table 8.1: Spatial Extents of Various Regions

Figure 8.12: Construction depicting node-disjoint paths between N and P

The region $B_1$ comprises $\{(x, y) \mid a + 1 \le x \le a + p - 1;\ b + 1 \le y \le b + q + r\}$,
and falls in nbd(N) (recall that N is located at $(a + p, b + q)$). The region $B_2$ comprises
$\{(x, y) \mid a + 1 - r \le x \le a + p - 1 - r;\ b + 1 \le y \le b + q + r\}$, and falls in nbd(P).
As may be seen, $B_2$ is obtained by a translation of $B_1$ to the left by r units. Thus, there is
a one-to-one correspondence between a node at (x, y) in $B_1$ and a node at (x − r, y) in $B_2$,
such that the nodes in each pair are neighbors. This yields $(p - 1)(r + q)$ paths of the form
$N \to B_1 \to B_2 \to P$.

Region $C_1$ comprises $\{(x, y) \mid a + p + 1 \le x \le a + r;\ b + q + 1 \le y \le b + r + 1\}$ and thus
falls within nbd(N). Region $C_2$ comprises $\{(x, y) \mid a + p + 1 - r \le x \le a;\ b + q + 1 + r \le y \le b + 1 + 2r\}$
and falls within nbd(P). It may be seen that there is a one-to-one
correspondence between any node at (x, y) in $C_1$ and the node at (x − r, y + r) in $C_2$, with
the paired nodes being neighbors. Hence there exist $(r - p)(r - q + 1)$ paths of the form
$N \to C_1 \to C_2 \to P$ that comprise two intermediate nodes each.

Region $D_1$ comprises $\{(x, y) \mid a + p \le x \le a + p + r - q;\ b + r + q - p + 1 \le y \le b + r + q\}$,
and falls in nbd(N). Region $D_2$ comprises $\{(x, y) \mid a + 1 \le x \le a + p;\ b + 1 + r + q \le y \le b + 1 + 2r\}$.
Region $D_3$ comprises $\{(x, y) \mid a + 1 - r \le x \le a + p - r;\ b + 1 + r + q \le y \le b + 1 + 2r\}$,
and falls in nbd(P). We note that regions $D_1$, $D_2$ and $D_3$ have exactly
the same number of nodes each. Besides, the regions $D_1$ and $D_2$ are mutually located in a
manner that each node in $D_2$ is a neighbor of each node in $D_1$ (maximum distance between
any two nodes < r). Hence, any one-to-one pairing of nodes in $D_1$ with nodes in $D_2$ is
valid. Further, a node located at (x, y) in $D_2$ has a one-to-one correspondence with a node
(x − r, y) in $D_3$. Hence, there are $p(r - q + 1)$ paths of the form $N \to D_1 \to D_2 \to D_3 \to P$
that comprise three intermediate nodes each (Fig. 8.12). Summing the four families gives
$(r - p + 1)(r + q) + (p - 1)(r + q) + (r - p)(r - q + 1) + p(r - q + 1) = r(r + q) + r(r - q + 1) = r(2r + 1)$.
Thus the r(2r + 1) node-disjoint paths are obtained.

Figure 8.13: Connectivity between P and nodes in $S_1$

We now consider nodes in regions $S_1$ and $S_2$ depicted in Fig. 8.10.

$S_1 = \{(a - r, b - p) \mid 0 \le p \le r - 1\}$. It can be shown that P has r(2r + 1) disjoint
paths to each node in $S_1$, as depicted in Fig. 8.13. Any node N in $S_1$ is located at
$(a - r, b - p)$ where $0 \le p \le r - 1$. Consider region J comprising
$\{(x, y) \mid a - 2r \le x \le a;\ b + 1 \le y \le b - p + r\}$. All nodes in J are common neighbors of N and
P, and provide $(r - p)(2r + 1)$ paths of the form $N \to J \to P$. Region $K_1$ comprises
$\{(x, y) \mid a - 2r \le x \le a;\ b - p + 1 \le y \le b\}$, and falls entirely within nbd(N). Region $K_2$
is $\{(x, y) \mid a - 2r \le x \le a;\ b - p + r + 1 \le y \le b + r\}$, and falls in nbd(P). For each
node (x, y) falling in $K_1$, there is a one-to-one correspondence with a node (x, y + r) in $K_2$,
and thus we obtain $p(2r + 1)$ paths of the form $N \to K_1 \to K_2 \to P$. This yields a total of
r(2r + 1) paths (all lying entirely within nbd(a − r, b + 1)), as depicted in Fig. 8.13.

Region $S_2$ comprises $\{(a - q, b - p) \mid r - 1 \ge q > p \ge 0\}$. For the nodes in $S_2$, observe
that each node $(a - q + 1, b - p + 1)$ in $S_2$ possesses the same relative position w.r.t. P as the
node $(a + p, b + q)$ in region U of Fig. 8.10 (note the axial symmetry about axis OO'), and
due to the symmetric structure of the network, shall enjoy exactly the same connectivity
properties to P as the node $(a + p, b + q)$ in region U. Since we have already shown existence
of sufficient connectivity for those nodes, the same holds for nodes in $S_2$.
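
The path construction for a node N in region U can be checked mechanically. The Python sketch below builds the four path families (via A, B1/B2, C1/C2 and D1/D2/D3) from the region extents of Table 8.1, read with non-strict inequalities, and verifies that they number r(2r + 1), are node-disjoint, use only single-hop L∞ links, and stay inside the neighborhood centered at (a, b + r + 1). The helper names and the particular pairing chosen between D1 and D2 (row-major order) are assumptions made for illustration; the text only requires that some one-to-one pairing be used.

```python
def linf(u, v):
    """L-infinity distance between two grid points."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

def box(x0, x1, y0, y1):
    """Grid points with x0 <= x <= x1 and y0 <= y <= y1 (empty if a range is empty)."""
    return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]

def paths(a, b, r, p, q):
    """Node-disjoint paths from N = (a+p, b+q) in U to P = (a-r, b+r+1)."""
    N, P = (a + p, b + q), (a - r, b + r + 1)
    out = [[N, n, P] for n in box(a + p - r, a, b + 1, b + q + r)]       # via A
    for x, y in box(a + 1, a + p - 1, b + 1, b + q + r):                 # B1 -> B2
        out.append([N, (x, y), (x - r, y), P])
    for x, y in box(a + p + 1, a + r, b + q + 1, b + r + 1):             # C1 -> C2
        out.append([N, (x, y), (x - r, y + r), P])
    D1 = box(a + p, a + p + r - q, b + r + q - p + 1, b + r + q)
    D2 = box(a + 1, a + p, b + 1 + r + q, b + 1 + 2 * r)
    for d1, (x, y) in zip(D1, D2):                                       # D1 -> D2 -> D3
        out.append([N, d1, (x, y), (x - r, y), P])
    return out

def verify(a, b, r, p, q):
    """Check count, disjointness, hop lengths and containment; return the path count."""
    ps = paths(a, b, r, p, q)
    center = (a, b + r + 1)
    inner = [n for path in ps for n in path[1:-1]]
    assert len(inner) == len(set(inner))                      # node-disjoint
    assert ps[0][0] not in inner and ps[0][-1] not in inner   # N, P are endpoints only
    assert all(linf(u, v) <= r for path in ps for u, v in zip(path, path[1:]))
    assert all(linf(n, center) <= r for path in ps for n in path)
    return len(ps)
```

For every admissible (p, q) with r ≥ q > p ≥ 1, `verify` is expected to return r(2r + 1).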

The  inductive  hypothesis,  along  with  the  base  case,  suffices  to  show  that  every  non- 
faulty  node  will  eventually  commit  to  the  correct  message  value,  since  starting  at  (0, 0),  one 
can  cover  the  entire  infinite  grid  by  moving  up,  down,  left  and  right.  Thus,  non-faulty  nodes 
in  the  neighborhood  of  every  grid  point  can  be  shown  to  be  able  to  eventually  determine 
the  broadcast  value. 

□ 

Non-worst Case Location of P: We briefly discuss how the connectivity argument
holds for all $P \in pnbd(a, b) \setminus nbd(a, b)$. We consider non-worst case locations of
$P \in \{(a - r + l, b + r + 1) \mid 1 \le l \le r\}$. For all other locations, the argument holds by symmetry.
The situation is depicted in Fig. 8.14. One may consider P to be translated to the right by
l units from its worst case location at $(a - r, b + r + 1)$. Then, region R that lies in direct
range of P (recall from Fig. 8.9) now comprises $r(r + l + 1)$ nodes. If we also translate
regions U, $S_1$, and $S_2$ by l units each to the right, they preserve their relative positions and
hence connectivity to P. However, now $\frac{1}{2}l(l - 1)$ nodes from U fall out of nbd(a, b), but
this is more than compensated by the increase of rl nodes in region R. Thus, if we count
the number of nodes in $nbd(a, b) \cap U$, $nbd(a, b) \cap S_1$, and $nbd(a, b) \cap S_2$, it can be shown
that they are at least r(r − 1) in number. Together with the $r(r + l + 1)$ nodes in region R,
they provide at least r(2r + 1) nodes to which P is connected either directly or via 2t + 1
node-disjoint paths all lying within some single neighborhood.

Figure 8.14: Non-worst Case Location of P
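
The counting in this argument can be reproduced numerically. The following Python sketch is a hedged illustration with names introduced here: it places P at (a − r + l, b + r + 1), translates U, S1 and S2 by l units to the right, and counts how many of the resulting nodes remain inside nbd(a, b). It assumes the coordinate descriptions of the regions used in the worst-case proof, read with non-strict inequalities.

```python
def connected_count(a, b, r, l):
    """Return (|R|, nodes of translated U outside nbd(a, b), total usable nodes)."""
    def in_nbd(x, y):
        return abs(x - a) <= r and abs(y - b) <= r

    px, py = a - r + l, b + r + 1                       # translated position of P
    # R: nodes of nbd(a, b) that P hears directly
    R = {(x, y)
         for x in range(px - r, px + r + 1)
         for y in range(py - r, py + r + 1)
         if in_nbd(x, y)}
    # U, S1, S2 translated l units to the right
    U = {(a + p + l, b + q) for q in range(2, r + 1) for p in range(1, q)}
    S1 = {(a - r + l, b - p) for p in range(0, r)}
    S2 = {(a - q + l, b - p) for q in range(1, r) for p in range(0, q)}
    kept = {n for n in U | S1 | S2 if in_nbd(*n)}
    lost_from_U = sum(1 for n in U if not in_nbd(*n))
    assert not (R & kept)                               # no node is counted twice
    return len(R), lost_from_U, len(R | kept)
```

For each 1 ≤ l ≤ r the first value should equal r(r + l + 1), the second ½l(l − 1), and the third should be at least r(2r + 1).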

8.7.1  Comparison  of  the  Two  Algorithms 

The algorithm described in this section is based on the stronger condition that every node in
pnbd(a, b) \ nbd(a, b) has 2t + 1 node-disjoint paths, all lying within some single neighborhood,
to each of 2t + 1 nodes in nbd(a, b). The algorithm described in Section 8.4 relies on a simpler
condition, and yet suffices to ensure reliable broadcast. It is also more efficient in terms of
greater localization of propagated messages. The alternative algorithm is still of interest, as
the particular localized connectivity property may possibly find use in distributed operations
other than reliable broadcast.

8.8 Discussion


In  this  chapter,  we  stated  and  proved  results  regarding  the  number  of  Byzantine  and  crash- 
stop  failures  that  may  be  tolerated  in  an  idealized  wireless  network  without  rendering 
reliable  broadcast  impossible.  We  considered  a  locally-bounded  adversarial  model  where 
the  adversary  is  free  to  choose  faulty  nodes,  so  long  as  the  placement  satisfies  the  constraint 
that  no  neighborhood  has  more  than  t  faults.  However,  in  the  presence  of  channel  errors 
etc.,  the  reliable  local  broadcast  assumption  that  underlies  these  results  is  not  trivial  to 
realize.  Thus,  implementation  of  a  reliable  broadcast  service  based  on  this  model  would 
require  efficient  implementation  of  a  reliable  local  broadcast  primitive  that  operates  under 
realistic  network  conditions.  In  Chapter  10,  we  consider  this  issue  in  some  detail. 

8.9  Future  Directions 

In  this  chapter,  we  described  results  for  achievability  of  reliable  broadcast  with  locally 
bounded  failures.  However,  we  did  not  study  the  efficiency  of  the  algorithms.  Thus,  it 
would  be  of  interest  to  determine  the  optimal  communication  complexity  for  achieving 
reliable broadcast for the grid network, as well as a wider class of network models.
Moreover, our focus was on a single broadcast instance; in typical application scenarios, the
broadcast operation will occur many times. In such scenarios, incorporating fault-detection
mechanisms can allow one to achieve weaker properties similar to the self-adjusting
Byzantine agreement of [129], which are often sufficient to meet reliability requirements. This is
a particularly promising approach in the wireless context, since the broadcast nature of the
wireless medium may make it easier to detect faulty behavior. Therefore, it is very relevant
to consider designing such algorithms for wireless network scenarios.

Chapter 9

Reliable  Broadcast  with 
Probabilistic  Failures 


In  this  chapter,  we  consider  the  problem  of  reliable  broadcast  in  wireless  networks  with 
probabilistic  failures.  Our  primary  focus  is  on  Byzantine  failures,  but  we  have  also  briefly 
addressed  the  case  of  crash-stop  failures.  We  begin  by  introducing  the  model  and  notation 
in  Section  9.1,  and  then  summarize  the  chapter  results  in  Section  9.2.  We  describe  a 
general  necessary  condition  in  Section  9.3.  We  present  necessary  and  sufficient  conditions 
for  reliable  broadcast  in  a  toroidal  grid  network  in  Section  9.4  and  Section  9.5  respectively, 
assuming the $L_\infty$ distance metric. A sufficient condition for random networks is presented
in Section 9.6. Results for grid networks with crash-stop failures are discussed in Section
9.7. In Section 9.8 we discuss how the $L_\infty$ metric results can be used to obtain results
for the $L_2$ metric, and in Section 9.9, we argue for the validity of the results even in
non-toroidal networks. We also identify an interesting but intuitive similarity in the structure of
the  results  (previously  known  results,  as  well  as  the  results  presented  in  this  chapter)  for  a 
set  of  related  problems  pertaining  to  connectivity  and  reliable  broadcast.  This  is  discussed 
in  Section  9.10. 

9.1  Preliminaries 

We  consider  two  spatial  layout  models  for  the  network: 

1.  A  regular  grid  layout,  where  nodes  are  located  on  a  two-dimensional  square  grid  (each 
grid  unit  is  a  1  x  1  square).  We  shall  refer  to  this  as  a  grid  network. 

2.  A network in which the node locations are independently and identically distributed
(i.i.d.) over the deployment region. We shall refer to this as a random network.

In both models, the network is assumed to be deployed over a $\sqrt{n} \times \sqrt{n}$ square region.
Each node is assumed to be aware of the locations of all nodes within its transmission range.

Recall the definition of the reliable broadcast problem with a designated source in
Chapter 7. For the results in this chapter, we assume that any node in the entire network can
be the designated source and can originate a broadcast message. Given such a broadcast
instance, if even one non-faulty node (in either model) fails to make a valid value
determination, the broadcast is deemed to have failed. Thus, reliable broadcast is said to fail
in  a  given  fault  configuration  if  it  fails  for  at  least  one  possible  choice  of  the  designated 
broadcast  source. 

For  a  given  broadcast  instance,  once  an  origin/source  is  designated,  it  is  identified  as 
(0,0).  All  nodes  can  then  be  uniquely  identified  by  their  coordinate  location  (x,y)  w.r.t. 
this  origin.  In  the  grid  network  model,  the  node  coordinates  are  always  integers,  while  for 
random networks they are real numbers. All nodes have a common transmission radius
r(n, p) (often abbreviated as r). For grid networks, we assume that r(n, p) is an integer,
and for random networks it is allowed to be any real number.

In  the  toroidal  grid  network,  each  node  has  the  same  number  of  neighbors  (i.e.,  the 
same degree). We use d(n, p) (often abbreviated as d) to denote the common node degree
for this model. The neighbor-set of a node u, including itself, is denoted by nbd(u). The
set of neighbors excluding itself is denoted by nbd'(u) = nbd(u) \ {u}.

For the grid network, in the $L_\infty$ metric, the degree of a node is $4r^2 + 4r$, while the
population of a neighborhood (including the neighborhood center) is $d + 1 = 4r^2 + 4r + 1$.
Thus, the minimum node degree is $d_{min} = 8$, corresponding to r = 1.

For succinct description, we also define a term pnbd(x, y), where
$pnbd(x, y) = nbd(x - 1, y) \cup nbd(x + 1, y) \cup nbd(x, y - 1) \cup nbd(x, y + 1)$. Intuitively, pnbd(x, y) denotes the
perturbed neighborhood of (x, y), obtained by perturbing the center of the neighborhood by
±1 along the x and y axes. We use Bernoulli(p) to denote a Bernoulli random variable
with parameter p.
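
The neighborhood notation can be illustrated with a short Python sketch. It works on an unbounded integer grid and ignores the toroidal wrap-around; the function names mirror the notation of the text, but the code itself is only an illustration introduced here.

```python
def nbd(x, y, r):
    """L-infinity neighborhood of (x, y) with radius r, including (x, y) itself."""
    return {(x + dx, y + dy) for dx in range(-r, r + 1) for dy in range(-r, r + 1)}

def pnbd(x, y, r):
    """Perturbed neighborhood: union of the neighborhoods centered one unit
    to the left, right, below and above (x, y)."""
    return (nbd(x - 1, y, r) | nbd(x + 1, y, r)
            | nbd(x, y - 1, r) | nbd(x, y + 1, r))

def degree(r):
    """Number of neighbors of a node, excluding the node itself."""
    return len(nbd(0, 0, r)) - 1
```

For r = 1 this gives a degree of 8, matching the minimum degree stated above. On this grid, pnbd(x, y) contains nbd(x, y) together with one extra row or column of 2r + 1 nodes on each of the four sides.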

A random failure mode is assumed wherein each node can fail with probability p
independently of other nodes. Failures are permanent. We primarily focus on Byzantine
failures. In the Byzantine failure mode, a faulty node can behave arbitrarily, in contrast
to crash-stop failures, where a faulty node simply stops functioning. As stated in Chapter
7, we assume that the Byzantine nodes cannot spoof addresses or cause collisions, i.e., the
MAC layer is assumed fault-free, and the Byzantine faults reside only in higher layers of the
protocol stack.¹ Note that while the occurrence of the permanent failures is probabilistic,
the failed Byzantine nodes can thereafter choose to behave in a worst-case manner (i.e.,
collude and modulate the messages they send to cause most confusion to non-faulty nodes).
The non-faulty nodes do not know which nodes have failed. The wireless channel conforms
to the reliable local broadcast assumption described in Chapter 7.
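
The random failure model can be made concrete with a small simulation sketch in Python. This is purely illustrative: the function names, the use of a square toroidal grid with side length `side`, and the requirement side > 2r (so that the neighbors of a node are all distinct) are assumptions introduced here, not part of the thesis.

```python
import random

def sample_faults(side, p, rng):
    """Mark each node of a side x side grid faulty independently with probability p."""
    return {(x, y) for x in range(side) for y in range(side) if rng.random() < p}

def faulty_neighbors(node, faults, side, r):
    """Count faulty nodes in nbd'(node) on the toroidal grid (L-infinity metric)."""
    x0, y0 = node
    count = 0
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            if (dx, dy) == (0, 0):
                continue                                  # exclude the node itself
            if ((x0 + dx) % side, (y0 + dy) % side) in faults:
                count += 1
    return count
```

A typical use would be to draw `faults = sample_faults(side, p, random.Random(seed))` and then inspect `faulty_neighbors` for each non-faulty node; how the faulty nodes subsequently behave is adversarial and is not modeled here.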

When  we  use  the  term  critical  transmission  range  for  reliable  broadcast,  we  imply 
the  smallest  transmission  range  that  can  ensure  reliable  broadcast  with  high  probability 
(w.h.p.). 

Thus: 

•  When we say that the critical transmission range is $\Omega(f(n, p))$, we imply that:

$\exists\, c_1 > 0$ such that when $r(n, p) \le c_1 f(n, p)$: $\lim_{n \to \infty} \Pr[\text{reliable broadcast achievable}] < 1$

Thus, the transmission range must necessarily be greater than $c_1 f(n, p)$ for reliable
broadcast to be achievable w.h.p.

•  When we say the critical transmission range is $O(f(n, p))$, we imply that:

$\exists\, c_2 > 0$ such that when $r(n, p) \ge c_2 f(n, p)$: $\lim_{n \to \infty} \Pr[\text{reliable broadcast achievable}] = 1$

Thus, the smallest transmission range needed to achieve reliable broadcast is no more
than $c_2 f(n, p)$.

•  When we say that the critical range is $\Theta(f(n, p))$, we imply that it is $\Omega(f(n, p))$ and
$O(f(n, p))$.

In a grid network, with the $L_\infty$ metric (discussed in Section 9.1), the node degree is
exactly determined by specifying the transmission range. Hence, we can define the notion
of critical degree corresponding to the critical transmission range. Thus:

•  When we say that the critical degree is $\Omega(g(n, p))$, we imply that:

$\exists\, a_1 > 0$ such that when $d(n, p) \le a_1 g(n, p)$: $\lim_{n \to \infty} \Pr[\text{reliable broadcast achievable}] < 1$

This yields a necessary condition.

•  When we say that the critical degree is $O(g(n, p))$, we imply that:

$\exists\, a_2 > 0$ such that when $d(n, p) \ge a_2 g(n, p)$: $\lim_{n \to \infty} \Pr[\text{reliable broadcast achievable}] = 1$

This yields a sufficient condition.

•  When we say that the critical degree is $\Theta(g(n, p))$, we imply that it is $\Omega(g(n, p))$ and
$O(g(n, p))$.

¹A methodology to handle a bounded number of collisions and address-spoofing was proposed in [58] for
a locally bounded fault model. It might be possible to adapt it to handle the random failure model. This
requires further investigation.
In  a  random  network,  the  degrees  of  individual  nodes  can  vary;  however,  it  is  possible 
to define a notion of critical average degree, which is the average degree corresponding to
the  critical  transmission  range. 

9.2  Summary  of  Results 

In this chapter, we show that:

1.  In a network of n nodes deployed in a regular grid pattern, when nodes exhibit Byzantine
failure with failure probability $p < \frac{1}{2}$ (see later sections for precise range of
validity), the critical node degree (defined in Section 9.1) for asymptotic achievability
of reliable broadcast is $\Theta\left(d_{min} + \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right)$. This may alternatively be stated
as $\Theta\left(d_{min} + \frac{\ln n}{D(Q\|P)}\right)$, where Q denotes the Bernoulli(1/2) distribution, P denotes
the Bernoulli(p) distribution, and $D(Q\|P)$ denotes the relative entropy (or
Kullback-Leibler distance) between distributions Q and P.

2.  In  a  network  of  n  nodes  located  uniformly  at  random  over  the  network  region,  when 

nodes  exhibit  Byzantine  failure  with  failure  probability  p  <  |,  the  critical  aver¬ 
age  node  degree  for  reliable  broadcast  is  Oilnn  +  — i — )(also  expressible  as 

™  2^+™  2(l-p) 

O  (  1  1”,)) — ]— )  for  this  regime). 

3.  For  crash-stop  failures  in  a  grid  deployment,  the  problem  of  reliable  broadcast  is 
equivalent  to  connectivity  in  the  presence  of  faults.  For  this  case,  we  have  derived 
Figure 9.1: Division of network into disjoint neighborhoods


results  showing  that  the  critical  node  degree  is  0  \^dmin  +  J  with  failure  probabil¬ 
ity  p  <  1  (see  later  sections  for  precise  range  of  validity).  Our  results  improve  upon 
previous results for crash-stop failures in a grid proved in [100] in the regime $p \to 0$.
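
The relative entropy appearing in the first result can be evaluated in closed form. Assuming Q = Bernoulli(1/2) and P = Bernoulli(p) (this reading of the parameter of Q is an assumption, suggested by the p < 1/2 threshold), one has D(Q‖P) = ½[ln(1/(2p)) + ln(1/(2(1 − p)))], so the two ways of writing the critical degree differ only by a constant factor. The Python sketch below, with function names introduced here for illustration, checks this identity numerically.

```python
import math

def kl_bernoulli(q, p):
    """Relative entropy D(Q||P) in nats for Q = Bernoulli(q), P = Bernoulli(p), 0 < p < 1."""
    total = 0.0
    for mass_q, mass_p in ((q, p), (1.0 - q, 1.0 - p)):
        if mass_q > 0.0:                       # the term 0 * ln(0 / x) is taken as 0
            total += mass_q * math.log(mass_q / mass_p)
    return total

def half_log_sum(p):
    """The expression (1/2) * [ln(1/(2p)) + ln(1/(2(1-p)))]."""
    return 0.5 * (math.log(1.0 / (2.0 * p)) + math.log(1.0 / (2.0 * (1.0 - p))))
```

As p approaches 1/2 the divergence tends to zero, which is consistent with the critical degree growing without bound near that threshold.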


A  preliminary  version  of  the  chapter  results  was  reported  in  [9]. 


9.3  General  Necessary  Condition  for  Byzantine  Failures 

In this section, we show that if at least half the neighbors of a non-faulty node not in
nbd(s) are faulty in the Byzantine sense, then the faulty nodes can make it commit to the
wrong broadcast value with probability at least 1/2. We remark that it is possible for a node
to  refrain  from  committing  to  any  value  (in  which  case  it  would  not  commit  to  the  wrong 
value).  However,  if  a  non-faulty  node  does  not  commit  to  any  value,  then  this  implies  failure 
of  the  reliable  broadcast  operation,  and  from  the  perspective  of  achievability  of  broadcast 
this  is  no  better  than  committing  to  a  wrong  value.  Thus,  we  focus  on  the  case  where  a 
node  does  indeed  commit  to  some  value. 

Theorem 22. Under the assumption that all message values are equally likely, if a
non-faulty node $u \notin nbd(s)$ has at least half faulty neighbors, then it can be made to commit to
an erroneous value with probability at least 1/2.

Proof. Assume that the message is drawn from {0, 1}. A non-faulty node u which is not an
immediate neighbor of the source must rely on messages received from its neighbors. Recall
that nbd'(u) = nbd(u) \ {u} and d = |nbd'(u)|.

First consider any deterministic function that takes as argument messages received from
all neighbors and outputs one of 0 or 1. Corresponding to each fault configuration $C_1$ with
$t \ge \frac{d}{2}$ faults in nbd'(u) (this also implies t faults in nbd(u) as u is non-faulty), there is
another configuration $C_2$ with t faults in nbd'(u), such that all non-faulty nodes in $C_1$ are
faulty in $C_2$, while the non-faulty nodes in $C_2$ were all faulty in $C_1$. Then the faulty nodes
can modulate their message-sending behavior so that u is unable to distinguish between the
case where the correct broadcast value was 0 and the fault configuration was $C_1$ and the case
when the correct value was 1 and the fault configuration was $C_2$ (recall that once failure
has happened, the faulty nodes can exhibit worst-case behavior).

Stated  formally:  suppose  C  nbd' (u)  is  the  set  of  faulty  neighbors  in  Ci,  and  = 
nbd' {u)  \  is  its  complement,  i.e.,  the  set  of  non-faulty  neighbors.  Then  we  know  that 
|5i|  >  >  |5c|_ 

Consider  a  fault  configuration  C2  in  which  the  set  of  faulty  neighbors  is  ^2  =  U  V 
where  V  C  5i  is  some  subset  of  that  satisfies  |V|  =  |5i|  —  |5f|.  Let  S2  denote  the 
complement  of  82-  It  is  easy  to  see  that  |5i|  =  |52|.  Consider  the  case  where  the  correct 
value  is  0,  and  fault  configuration  is  Ci.  Then  all  nodes  in  5i  can  behave  as  though  the 
value  were  1,  while  the  nodes  in  will  always  act  according  to  value  0.  Now  suppose 
the  correct  value  is  1,  and  the  fault  conhguration  is  C2.  Then  the  faulty  nodes  in  C  82 
behave  as  though  the  value  were  0,  while  nodes  in  V  =  ^2  \  act  as  per  the  correct  value 
1.  The  non-faulty  nodes  in  82  always  act  as  per  value  1.  From  the  viewpoint  of  node  u, 
the  two  situations  are  indistinguishable. 

Next consider the possibility of using a probabilistic decision rule. Given a set of messages
received from neighbors, we need to consider the conditional probability that the value
is 0 or 1. From the above discussion it is clear that for a given set of received messages from
neighbors, there exists a pair of fault configurations, and associated faulty-node behavior,
with the same number of faulty neighbors, where the correct message values are different.
Since failures are i.i.d. with probability p, and each value 0 or 1 is equiprobable, u cannot
hope to choose the correct one with a probability greater than half.

It  is  not  hard  to  see  that  if  the  message  can  have  more  than  two  possible  (equiprobable) 
values,  it  cannot  increase  the  probability  of  correct  choice. 

□ 
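The construction in the proof can be checked mechanically on a small instance. The sketch below is only an illustration: the function names, the integer neighbor ids, and the deterministic choice of $V$ (the smallest ids in $S_1$) are assumptions made for the example, not part of the original argument.

```python
def build_second_configuration(neighbors, s1):
    """Given faulty set S1 (at least half of the neighbors), return (S2, V)."""
    s1 = set(s1)
    s1c = set(neighbors) - s1
    assert len(s1) >= len(s1c), "construction needs at least half the neighbors faulty"
    # V is any subset of S1 with |V| = |S1| - |S1^c|; smallest ids chosen for determinism.
    v = set(sorted(s1)[: len(s1) - len(s1c)])
    return s1c | v, v


def view_configuration_1(neighbors, s1):
    # Correct value 0: faulty nodes (S1) claim 1, non-faulty nodes (S1^c) report 0.
    return {i: (1 if i in s1 else 0) for i in neighbors}


def view_configuration_2(neighbors, s1):
    # Correct value 1: faulty nodes in S1^c claim 0; nodes in V (faulty, but acting
    # as per value 1) and the non-faulty nodes S1 \ V all report 1.
    s1c = set(neighbors) - set(s1)
    return {i: (0 if i in s1c else 1) for i in neighbors}


neighbors = list(range(8))          # d = 8 neighbors of u
s1 = {0, 1, 2, 3, 4}                # t = 5 >= d/2 faulty neighbors in C1
s2, v = build_second_configuration(neighbors, s1)
indistinguishable = view_configuration_1(neighbors, s1) == view_configuration_2(neighbors, s1)
```

Here `indistinguishable` is true and `len(s2) == len(s1)`, matching the claim that both configurations have the same number of faults while presenting identical messages to u.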
If the failure probability p is at least $\frac{1}{2}$, it can be seen that the probability that at least
half the neighbors of a given node are faulty is at least $\frac{1}{2}$ (for even node degree, this follows
from Lemma 50; for odd degree, we can first argue for $p = \frac{1}{2}$ and then use a monotonicity
argument). Therefore, it is only relevant to study the achievability of broadcast for $p < \frac{1}{2}$.
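As a numerical sanity check of this remark, the binomial tail can be evaluated exactly. This is a minimal sketch; the function name is hypothetical, and "at least half" is interpreted as at least $\lceil d/2 \rceil$ faulty neighbors for odd d.

```python
from fractions import Fraction
from math import comb


def prob_at_least_half_faulty(d, p):
    """Exact Pr[at least ceil(d/2) of d neighbors are faulty], failures i.i.d. with probability p."""
    p = Fraction(p)
    threshold = (d + 1) // 2  # ceil(d/2)
    return sum(comb(d, k) * p**k * (1 - p) ** (d - k) for k in range(threshold, d + 1))


# Even degree at p = 1/2: strictly more than 1/2; odd degree at p = 1/2: exactly 1/2.
even_case = prob_at_least_half_faulty(8, Fraction(1, 2))
odd_case = prob_at_least_half_faulty(7, Fraction(1, 2))
```

Exact rational arithmetic is used so that the boundary case (odd d, p = 1/2) is not obscured by floating-point rounding.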


9.4  Byzantine  Failures  in  a  Grid  Network:  Necessary 
Condition 

Note that when p = 0, it is still necessary to ensure that each node has non-zero degree
for broadcast to be possible, and this requires that r be set to 1 (and hence $d = d_{min} = 8$).
Thus, when p = 0, it trivially follows that the node degree must be at least
$\max\left\{d_{min}, \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right\}$ (we adopt the standard convention that $x \log\frac{1}{0} = \infty$ for any $x > 0$;
we also adopt the convention that $\frac{y}{\infty} = 0$ for any finite $y > 0$).

Hence, the case of interest is when $p > 0$. It is easy to see that $r \ge 1$ (correspondingly
$d \ge d_{min} = 8$) is necessary for any p.

Theorem 23. Assuming the $L_\infty$ distance metric, in a grid network where nodes can fail
(in a Byzantine sense) independently with probability p such that $0 < p \le \frac{1}{2} - \frac{1}{\ln n}$, if the
node degree is $d < \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}$:

\[ \lim_{n \to \infty} \Pr[\text{reliable broadcast fails}] = 1 \]


Proof. It is evident that $r(n,p)$ must be at least 1 for reliable broadcast, else all nodes in
the grid are isolated. Thus $d(n,p)$ must be at least $d_{min} = 8$. Therefore, in the rest of the
proof, we only need to consider the case where $\frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}} > d_{min}$, and $r(n,p)$ is set to at
least 1.

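Any failure probability $p \le \frac{1}{2} - \frac{1}{\ln n}$ can be expressed as $p = \frac{1}{2} - y$ for suitable $\frac{1}{\ln n} \le y < \frac{1}{2}$.
It can be seen that:

\[ \ln\frac{1}{2p} + \ln\frac{1}{2(1-p)} = \ln\frac{1}{1-2y} + \ln\frac{1}{1+2y} = \ln\frac{1}{1-4y^2} \ge 4y^2 \ge \frac{4}{(\ln n)^2} \tag{9.1} \]

(noting that $4y^2 < 1$ and applying Fact 1).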
Resultantly:

\[ d < \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}} \le \frac{\ln n}{\frac{4}{(\ln n)^2}} < (\ln n)^3 \tag{9.2} \]

Furthermore, it is evident that:

\[ \frac{\ln n}{2} + 6 \ln\ln n \le \ln n - 4 \ln\ln n \quad \text{for sufficiently large } n \tag{9.3} \]


Consider a particular node j in the network. From Theorem 22, it follows that if j is
non-faulty, but at least half of its neighbors are faulty, reliable broadcast will fail with
probability at least $\frac{1}{2}$.

We know that there are d neighbors of j, and each may fail independently with probability
p. Let $I_{jk}$ ($1 \le k \le d$) denote an indicator variable corresponding to neighbor k
of j (enumerated in some order), such that $I_{jk} = 1$ if k is faulty, and 0 otherwise. Then
$Y_j = \sum_{k \in nbd'(j)} I_{jk}$ denotes the number of failed neighbors of j. $Y_j$ takes values from 0, 1, ..., d,
and $E[Y_j] = pd$. Note that in the $L_\infty$ metric, d is always even, and $d \ge 8$ for all $r(n,p) \ge 1$.


Also: 

PrK>^]= 


Let us simply consider the event $Y_j = \frac{d}{2}$. Then we can apply the lower bound from Lemma
56 as follows: the variables $I_{jk}$ ($1 \le k \le d$) are drawn from $\chi = \{0, 1\}$ as per distribution
$P = Bernoulli(p)$, and the distribution corresponding to $Y_j = \frac{d}{2}$ is $Bernoulli(\frac{1}{2})$ (we shall
refer to this as $Q_{1/2}$). $|\chi| = 2$, and $D(Q_{1/2}\|P) = \frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}$. Thus, we obtain:


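\[
\begin{aligned}
\Pr\left[Y_j \ge \tfrac{d}{2}\right] \ge \Pr\left[Y_j = \tfrac{d}{2}\right]
 &\ge \frac{1}{(d+1)^{|\chi|}} e^{-d\,D(Q_{1/2}\|P)} = \frac{1}{(d+1)^2} e^{-d\,D(Q_{1/2}\|P)} \\
 &= \frac{1}{d^2\left(1+\frac{1}{d}\right)^2} e^{-d\,D(Q_{1/2}\|P)} \ge \frac{2}{3} e^{-d\,D(Q_{1/2}\|P) - 2\ln d} \\
 &= \frac{2}{3} e^{-\left(\frac{d}{2}\left(\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}\right) + 2\ln d\right)} \\
 &\ge \frac{2}{3} e^{-\left(\frac{1}{2}\ln n + 6\ln\ln n\right)} \quad \text{using (9.2)} \\
 &\ge \frac{2(\ln n)^4}{3n} \quad \text{using (9.3)}
\end{aligned}
\tag{9.4}
\]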
Let us denote the L.H.S. of the above inequality, i.e., $\Pr[Y_j \ge \frac{d}{2}]$, by q.

\[ \Pr[\, j \text{ non-faulty; at least half of } nbd(j) \text{ faulty} \,] \ge (1-p)q \ge \frac{1}{2}\left(\frac{2(\ln n)^4}{3n}\right) = \frac{(\ln n)^4}{3n} \tag{9.5} \]


We mark out a subset of nodes j such that the neighborhoods of these nodes are all
disjoint, as in Fig. 9.1. From Fact 3, the number of such nodes that we may obtain is
$k \ge \frac{n}{2d}$ for large n (from (9.2), d = o(n)). In fact, it is not hard to see from the argument
used in the statement of Fact 3 that the number of such nodes would exceed $\frac{n}{2d} + 1$ for
large enough n. We can designate one such node as the broadcast source, and examine the
probability that any of the remaining nodes ($k \ge \frac{n}{2d}$ in number) can be made to commit to
the wrong broadcast value.

Let $I_j$ be an indicator variable that takes value 1 if a node j is non-faulty and has at
least half faulty neighbors. From (9.5), we know that $\Pr[I_j = 1] \ge \frac{(\ln n)^4}{3n}$. Furthermore, all
the $I_j$'s are independent.

Let $I'_j$ be an indicator variable that takes value 1 if j is non-faulty but commits to a
wrong value. From Theorem 22, we know that if a non-faulty node has half or more faulty
neighbors, it can be made to commit to the wrong value with probability at least $\frac{1}{2}$. Thus
$\Pr[I'_j = 1] \ge \frac{1}{2}\Pr[I_j = 1] \ge \frac{(\ln n)^4}{6n}$.

Let X be a random variable indicating the number of non-faulty nodes with half or
more faulty neighbors that commit to the wrong value. Then $X = \sum_j I'_j$, and $E[X] =
\sum_j \Pr[I'_j = 1] \ge \frac{n}{2d}\left(\frac{(\ln n)^4}{6n}\right) = \frac{(\ln n)^4}{12d} > \frac{\ln n}{12}$ ($\because d < (\ln n)^3$ from (9.2)). Therefore, we can
choose a suitable constant $0 < \beta < 1$ (e.g., $\beta = \frac{1}{2}$) and apply the Chernoff bound in Lemma
53 to obtain:


\[ \Pr[X \ge (1-\beta)E[X]] \ge 1 - e^{-\frac{\beta^2 E[X]}{2}} \tag{9.6} \]

\[ \lim_{n \to \infty} \Pr[X \ge (1-\beta)E[X]] = 1 \quad \because \lim_{n \to \infty} E[X] = \infty \]

This yields:

\[ \lim_{n \to \infty} \Pr[\text{reliable broadcast fails}] = 1 \]


□ 


In light of the prior observation about the necessity of $r(n,p)$ being at least 1 (i.e.,
$d(n,p)$ being at least $d_{min}$), and the result of Theorem 23, it follows that for all $p \le \frac{1}{2} - \frac{1}{\ln n}$,
if the node degree is less than $\max\left\{d_{min}, \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right\}$, reliable broadcast fails w.h.p.
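To make the necessary condition concrete, the threshold can be evaluated numerically. The following is a minimal sketch under the assumption that the threshold has the form $\max\{d_{min}, \ln n / (\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)})\}$ with $d_{min} = 8$; the function names are hypothetical.

```python
import math

D_MIN = 8  # node degree at r = 1 in the L-infinity metric


def relative_entropy_half(p):
    """D(Q_1/2 || P) for P = Bernoulli(p), in nats; requires 0 < p < 1/2."""
    return 0.5 * math.log(1 / (2 * p)) + 0.5 * math.log(1 / (2 * (1 - p)))


def necessary_degree(n, p):
    """Degree below which the necessary condition says broadcast fails w.h.p. (assumed form)."""
    denominator = math.log(1 / (2 * p)) + math.log(1 / (2 * (1 - p)))
    return max(D_MIN, math.log(n) / denominator)


moderate = necessary_degree(10**6, 0.25)   # ln(10^6) / ln(4/3), roughly 48
tiny_p = necessary_degree(100, 1e-6)       # falls back to d_min = 8
```

Note that the denominator equals $2 D(Q_{1/2}\|P)$, so the threshold can equivalently be read as $\ln n / (2 D(Q_{1/2}\|P))$; as p approaches 1/2 the relative entropy vanishes and the required degree grows without bound.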
Figure 9.2: Depiction of $qnbd_A$, $qnbd_B$, $qnbd_C$, $qnbd_D$

Figure 9.3: Depiction of $qnbd_{A'}$, $qnbd_{B'}$, $qnbd_{C'}$, $qnbd_{D'}$


9.5  Byzantine  Failures  in  a  Grid  Network:  Sufficient 
Condition 


We  now  state  and  prove  a  sufficient  condition  for  the  achievability  of  reliable  broadcast  in 
a  grid  network.  Intuitively,  the  approach  involves  showing  that  if  the  degree  of  a  node  is 
sufficiently  large,  then  the  node  can  look  at  messages  received  from  a  constant  fraction  of 
its  neighbors,  and  act  upon  the  majority  opinion  in  this  subset;  doing  so  will  enable  it  to 
correctly  determine  the  broadcast  value,  since  a  majority  of  the  nodes  in  that  subset  will 
be  non-faulty  w.h.p. 

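Theorem 24. Assuming the $L_\infty$ distance metric, in a grid network with Byzantine failure
probability $p < \frac{1}{2}$, when $r(n,p)$ is chosen such that
$d(n,p) = 4r^2 + 4r \ge \max\left\{d_{min}, \frac{16 \ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right\} = \max\left\{d_{min}, \frac{8 \ln n}{D(Q_{1/2}\|P)}\right\}$:

\[ \lim_{n \to \infty} \Pr[\text{reliable broadcast is achievable}] = 1 \]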

Note that when $\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)} < \frac{16 \ln n}{n}$, the degree expression exceeds the total network
size n, and the sufficient condition ceases to be relevant (as node degree $d(n,p)$ cannot exceed
n). Such a value of $d(n,p)$ corresponds to a transmission range $r(n,p)$ that
spans the entire network, effectively implying that the network is single-hop; due to the local
broadcast assumption, reliable broadcast is trivially achievable in a single-hop network.

Therefore, the sufficient condition is relevant only so long as $\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)} \ge \frac{16 \ln n}{n}$,
and this is the case that we consider.

Region           x-extent                   y-extent
qnbd_A(a, b)     a ≤ x ≤ (a + r)            (b − r) ≤ y ≤ (b − 1)
qnbd_B(a, b)     (a − r) ≤ x ≤ (a − 1)      (b − r) ≤ y ≤ b
qnbd_C(a, b)     (a − r) ≤ x ≤ a            (b + 1) ≤ y ≤ (b + r)
qnbd_D(a, b)     (a + 1) ≤ x ≤ (a + r)      b ≤ y ≤ (b + r)
qnbd_A'(a, b)    (a + 1) ≤ x ≤ (a + r)      (b − r) ≤ y ≤ b
qnbd_B'(a, b)    (a − r) ≤ x ≤ a            (b − r) ≤ y ≤ (b − 1)
qnbd_C'(a, b)    (a − r) ≤ x ≤ (a − 1)      b ≤ y ≤ (b + r)
qnbd_D'(a, b)    a ≤ x ≤ (a + r)            (b + 1) ≤ y ≤ (b + r)

Table 9.1: Spatial Extents of Quarter Neighborhoods
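The spatial extents of Table 9.1 can be enumerated directly to confirm the properties used in the argument that follows. The sketch below is illustrative only: it ignores toroidal wrap-around, and the helper names (`box`, `neighborhood`, `quarter_neighborhoods`) are hypothetical.

```python
def box(x_lo, x_hi, y_lo, y_hi):
    """Set of integer grid points with x_lo <= x <= x_hi and y_lo <= y <= y_hi."""
    return {(x, y) for x in range(x_lo, x_hi + 1) for y in range(y_lo, y_hi + 1)}


def neighborhood(a, b, r):
    """All nodes within L-infinity distance r of (a, b), excluding (a, b) itself."""
    return box(a - r, a + r, b - r, b + r) - {(a, b)}


def quarter_neighborhoods(a, b, r):
    """The eight quarter-neighborhoods of (a, b), with extents as in Table 9.1."""
    return {
        "A": box(a, a + r, b - r, b - 1),
        "B": box(a - r, a - 1, b - r, b),
        "C": box(a - r, a, b + 1, b + r),
        "D": box(a + 1, a + r, b, b + r),
        "A'": box(a + 1, a + r, b - r, b),
        "B'": box(a - r, a, b - r, b - 1),
        "C'": box(a - r, a - 1, b, b + r),
        "D'": box(a, a + r, b + 1, b + r),
    }


r = 3
d = 4 * r * r + 4 * r
q = quarter_neighborhoods(0, 0, r)
populations = {name: len(points) for name, points in q.items()}  # each equals r(r + 1) = d / 4
```

For r = 3 every quarter-neighborhood holds r(r + 1) = 12 = d/4 nodes, the four unprimed regions tile the neighborhood, and each region coincides with a $qnbd_A$ or $qnbd_{A'}$ region of a shifted node (for example, $qnbd_B(a,b)$ equals $qnbd_{A'}(a - r - 1, b)$).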

Case 1: $p = o(\frac{1}{n})$. By application of the union bound, the probability that at least one
node fails is at most np. Since $p = o(\frac{1}{n})$, $\lim_{n \to \infty} np = 0$. Therefore, the probability
that no node fails approaches 1 asymptotically, and reliable broadcast is trivially ensured
w.h.p. even with the minimum transmission range of 1.

Case 2: $p = \Omega(\frac{1}{n})$. We define a term called quarter-neighborhood of a node (x, y), and
denote it by qnbd(x, y). We associate eight quarter-neighborhoods with each node: $qnbd_A$,
$qnbd_B$, $qnbd_C$, $qnbd_D$, $qnbd_{A'}$, $qnbd_{B'}$, $qnbd_{C'}$, $qnbd_{D'}$. The quarter-neighborhoods for a
node (a, b) are the regions depicted in Figs. 9.2 and 9.3, and their spatial extents are
tabulated in Table 9.1. Observe that $qnbd_B(a,b) = qnbd_{A'}(a - r - 1, b)$, $qnbd_C(a,b) =
qnbd_A(a - r, b + r + 1)$, and $qnbd_D(a,b) = qnbd_{A'}(a, b + r)$. Similarly, $qnbd_{B'}(a,b) = qnbd_A(a - r, b)$,
$qnbd_{C'}(a,b) = qnbd_{A'}(a - r - 1, b + r)$, and $qnbd_{D'}(a,b) = qnbd_A(a, b + r + 1)$. Thus,
if we simply consider $qnbd_A(u)$ and $qnbd_{A'}(u)$ for all nodes u, we will have considered all
quarter-neighborhoods, i.e., the number of distinct (but not disjoint) quarter-neighborhoods
is 2n. Henceforth, we shall sometimes use $Q(x,y)$ to refer to $qnbd_A(x,y)$, and $Q'(x,y)$ to
refer to $qnbd_{A'}(x,y)$. The population of each quarter-neighborhood is $r(r+1)$. Since
$d = 4r^2 + 4r = 4r(r+1)$ in the $L_\infty$ metric, the population of each quarter-neighborhood
is $\frac{d}{4}$. We now state and prove the following result which is crucial to proving our sufficient
condition for reliable broadcast:
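Lemma 44. If $p < \frac{1}{2}$ and $d \ge \max\left\{d_{min}, \frac{16 \ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right\} = \max\left\{d_{min}, \frac{8 \ln n}{D(Q_{1/2}\|P)}\right\}$, then:

\[ \lim_{n \to \infty} \Pr\left[\forall (x,y): \text{less than } \tfrac{d}{8} \text{ faults in } Q(x,y) \text{ and } Q'(x,y)\right] = 1 \]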


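Proof. As shown above, the population of any quarter-neighborhood is $\frac{d}{4}$. Each node may
fail independently with probability p. Let $Y_{(x,y)}$ be a random variable denoting the number
of faulty nodes in $Q(x,y)$. Then $Y_{(x,y)} = \sum_{j \in Q(x,y)} I_j$, where $I_j$ is an indicator variable which is
1 if node j in $Q(x,y)$ is faulty, and is 0 otherwise.

Noting that $p < \frac{1}{2}$, we can apply the relative entropy form of the Chernoff-Hoeffding
bound (Lemma 54) to $Y_{(x,y)}$. Observe that $d \ge \max\left\{d_{min}, \frac{16 \ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right\} \ge \frac{16 \ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}$.
Thus, we obtain:

\[
\Pr\left[Y_{(x,y)} \ge \tfrac{d}{8}\right]
 \le e^{-\frac{d}{4}\left(\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}\right)}
 \le e^{-\left(\frac{4 \ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right)\left(\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}\right)}
 = e^{-2 \ln n} = \frac{1}{n^2}
\tag{9.7}
\]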

Similarly, setting $Y'_{(x,y)}$ to be a random variable denoting the number of faulty nodes in
$Q'(x,y)$, and following the same argument as above, we obtain that:

\[ \Pr\left[Y'_{(x,y)} \ge \tfrac{d}{8}\right] \le \frac{1}{n^2} \tag{9.8} \]

By application of the union bound over all 2n distinct quarter-neighborhoods:

\[ \therefore \Pr\left[\forall (x,y),\; Y_{(x,y)} < \tfrac{d}{8} \text{ and } Y'_{(x,y)} < \tfrac{d}{8}\right] \ge 1 - 2n \cdot \frac{1}{n^2} = 1 - \frac{2}{n} \tag{9.9} \]

\[ \lim_{n \to \infty} \Pr\left[\forall (x,y),\; Y_{(x,y)} < \tfrac{d}{8} \text{ and } Y'_{(x,y)} < \tfrac{d}{8}\right] = 1 \]

□

We now consider a simple broadcast protocol that is similar to the protocol that was
described in [57] for the locally bounded model:

• Initially, the source does a local broadcast of the message.

• Each neighbor i of the source immediately commits to the first value v it heard
from the source, and then locally broadcasts it once in a COMMITTED(i, v) message.

• Hereafter, the following protocol is followed by each node $j \notin nbd(s)$:
If $\frac{1}{2}r(r+1) + 1 = \frac{d}{8} + 1$ COMMITTED(i, v) messages are received for a certain value
v, from neighbors i all lying within a single quarter-neighborhood, and j has not already
committed to some value, commit to v, and locally broadcast a COMMITTED(j, v)
message.^
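The protocol steps above can be sketched as a small synchronous simulation. This is an illustration under stated assumptions rather than the protocol as specified in [57]: the network is an m × m torus with m > 2r + 1, the source is assumed non-faulty, and the adversary is a deliberately simple one in which every faulty node announces the wrong value from the start. All function names are hypothetical.

```python
from itertools import product


def quarter_offsets(r):
    """Offsets (dx, dy) of the eight quarter-neighborhoods, following Table 9.1."""
    def box(x_lo, x_hi, y_lo, y_hi):
        return [(dx, dy) for dx in range(x_lo, x_hi + 1) for dy in range(y_lo, y_hi + 1)]
    return [
        box(0, r, -r, -1), box(-r, -1, -r, 0), box(-r, 0, 1, r), box(1, r, 0, r),
        box(1, r, -r, 0), box(-r, 0, -r, -1), box(-r, -1, 0, r), box(0, r, 1, r),
    ]


def run_protocol(m, r, source, value, faulty):
    """Return {node: committed value} for the non-faulty nodes of an m x m torus."""
    d = 4 * r * r + 4 * r
    threshold = d // 8 + 1
    boxes = quarter_offsets(r)
    # announced[node] is the value carried by that node's COMMITTED message.
    announced = {f: 1 - value for f in faulty}   # assumed adversary: always lie
    committed = {}
    sx, sy = source
    # The source and its neighbors: non-faulty neighbors commit to the value heard directly.
    for dx, dy in product(range(-r, r + 1), repeat=2):
        node = ((sx + dx) % m, (sy + dy) % m)
        if node not in faulty:
            committed[node] = announced[node] = value
    changed = True
    while changed:
        changed = False
        newly_committed = {}
        for node in product(range(m), repeat=2):
            if node in faulty or node in committed:
                continue
            x, y = node
            for offsets in boxes:
                counts = {0: 0, 1: 0}
                for dx, dy in offsets:
                    heard = announced.get(((x + dx) % m, (y + dy) % m))
                    if heard is not None:
                        counts[heard] += 1
                # At most one value can reach d/8 + 1 inside a region of d/4 nodes.
                winners = [v for v in (0, 1) if counts[v] >= threshold]
                if winners:
                    newly_committed[node] = winners[0]
                    break
        for node, v in newly_committed.items():
            committed[node] = announced[node] = v
            changed = True
    return committed


fault_free = run_protocol(9, 2, (0, 0), 1, set())
with_faults = run_protocol(9, 2, (0, 0), 1, {(4, 4), (6, 2)})
```

With r = 2 (d = 24, threshold 4) and two faulty nodes, every quarter-neighborhood contains fewer than d/8 = 3 faults, so all 79 non-faulty nodes commit to the correct value in this run.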

Theorem 25. The probability that a non-faulty node shall commit to a wrong value by
following the above protocol tends to 0 as $n \to \infty$.

Proof. If all $Q(x,y)$ and $Q'(x,y)$ have strictly less than $\frac{d}{8}$ faults, the correctness of the
protocol is argued as follows:

By the assumptions of reliable local broadcast, if s sends exactly one message, fault-free
nodes in $nbd(s)$ are guaranteed to receive it correctly. If s is faulty and sends more than
one version of the message, fault-free nodes in $nbd(s)$ receive both messages, and select the
first one. Thus fault-free nodes in $nbd(s)$ are guaranteed to commit to the correct value.

The rest of the proof is by contradiction. Consider the first fault-free node, say j, that
makes a wrong decision to commit to a value v. From our previous assertion, j cannot
be in $nbd(s)$, and hence followed protocol rules for nodes that are not s's neighbors. This
implies that $\frac{d}{8} + 1$ of its neighbors within some quarter-neighborhood must have broadcast a
COMMITTED message for v (the COMMITTED messages were directly heard, leaving no
place for doubt). All of these nodes cannot be faulty, as less than $\frac{d}{8}$ nodes in any
quarter-neighborhood are faulty. Thus, there was at least one fault-free node that committed to v.
Since j is the first fault-free node to make a wrong decision, none of the fault-free nodes
amongst the $\frac{d}{8} + 1$ nodes could have made a wrong decision. Therefore, v must indeed be
the correct value.

From Lemma 44, all the quarter-neighborhoods have less than $\frac{d}{8}$ faults with a probability
that tends to 1 as $n \to \infty$, and hence the protocol also functions correctly with a probability
that tends to 1 as $n \to \infty$. □
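The bound invoked from Lemma 44 can be examined numerically for a concrete (n, p). The sketch below assumes the relative-entropy Chernoff-Hoeffding form $\Pr[Y \ge m/2] \le e^{-m D(Q_{1/2}\|P)}$ for a quarter-neighborhood of population $m = d/4$, and compares it with the exact binomial tail; the function names and the chosen values of n and p are illustrative only.

```python
import math


def relative_entropy_half(p):
    """D(Q_1/2 || P) for P = Bernoulli(p), in nats; requires 0 < p < 1/2."""
    return 0.5 * math.log(1 / (2 * p)) + 0.5 * math.log(1 / (2 * (1 - p)))


def smallest_sufficient_range(n, p, d_min=8):
    """Smallest integer r with d = 4r(r + 1) >= max(d_min, 8 ln n / D(Q_1/2 || P))."""
    target = max(d_min, 8 * math.log(n) / relative_entropy_half(p))
    r = 1
    while 4 * r * (r + 1) < target:
        r += 1
    return r


def exact_tail(m, p):
    """Exact Pr[Binomial(m, p) >= m / 2] for even m."""
    return sum(math.comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(m // 2, m + 1))


def chernoff_bound(m, p):
    """Relative-entropy upper bound on Pr[Binomial(m, p) >= m / 2]."""
    return math.exp(-m * relative_entropy_half(p))


n, p = 1000, 0.1
r = smallest_sufficient_range(n, p)      # r = 5, so d = 120
m = (4 * r * (r + 1)) // 4               # quarter-neighborhood population d / 4 = 30
all_good_lower_bound = 1 - 2 * n * chernoff_bound(m, p)   # union bound over 2n regions
```

For these values the per-region bound is below $1/n^2$, so the union bound over the 2n distinct quarter-neighborhoods leaves a success probability of at least $1 - 2/n$.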

Theorem  26.  Each  non-faulty  node  is  eventually  able  to  commit  to  the  correct  value  w.h.p. 
Proof.  The  proof  is  by  induction. 

^Note that $\frac{d}{8} = \frac{r(r+1)}{2}$ is always an integer, since r is assumed to take only integer values in the grid
network case.
Figure 9.4: Node u has a quarter-neighborhood contained in nbd(a, b)

Base Case: All non-faulty nodes in $nbd(0, 0)$ are able to commit to the correct value. This
follows trivially since they hear the source directly, and by assumption address-spoofing is
impossible.

Inductive Hypothesis: If all non-faulty neighbors of a node located at (a, b), i.e., all
non-faulty nodes in $nbd(a, b)$, are able to commit to the correct value, then all non-faulty
nodes in $pnbd(a, b)$ are able to commit to the correct value.

Proof of Inductive Hypothesis: We show that each node u in $pnbd(a, b) \setminus nbd(a, b)$ has
at least one of $qnbd_A(u)$, $qnbd_B(u)$, $qnbd_C(u)$, $qnbd_D(u)$, $qnbd_{A'}(u)$, $qnbd_{B'}(u)$, $qnbd_{C'}(u)$,
$qnbd_{D'}(u)$ fully contained in $nbd(a, b)$. Since the population of each quarter-neighborhood is
$\frac{d}{4}$, and strictly less than $\frac{d}{8}$ of the nodes in a quarter-neighborhood are faulty with probability
that tends to 1 asymptotically, the number of non-faulty nodes in each quarter-neighborhood
is at least $\frac{d}{8} + 1$ (since $\frac{d}{8}$ is always an integer). This ensures that the node will become aware
of $\frac{d}{8} + 1$ nodes in $nbd(a, b)$ having committed to a (the correct) value, and will also commit to
it (if it is non-faulty). The situation is depicted in Fig. 9.4 for $u \in \{(a - r + l, b + r + 1) \mid 1 \le l \le r\}$,
for which $qnbd_A(u)$ lies in $nbd(a, b)$. For other locations, a similar argument holds. □
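The containment claim in the inductive step can be verified exhaustively for small r. The sketch below assumes that the perturbed neighborhood is $pnbd(a,b) = nbd(a+1,b) \cup nbd(a-1,b) \cup nbd(a,b+1) \cup nbd(a,b-1)$ (the definition belongs to Section 9.1 and is not restated in this part of the chapter, so it is an assumption here), takes $nbd(a,b)$ to include the point (a, b), and ignores wrap-around. The helper names are hypothetical.

```python
def box(x_lo, x_hi, y_lo, y_hi):
    return {(x, y) for x in range(x_lo, x_hi + 1) for y in range(y_lo, y_hi + 1)}


def nbd(a, b, r):
    """Nodes within L-infinity distance r of (a, b), the point (a, b) included."""
    return box(a - r, a + r, b - r, b + r)


def pnbd(a, b, r):
    """Perturbed neighborhood; this definition is assumed, see the lead-in."""
    return nbd(a + 1, b, r) | nbd(a - 1, b, r) | nbd(a, b + 1, r) | nbd(a, b - 1, r)


def quarter_neighborhoods(a, b, r):
    """The eight quarter-neighborhoods of (a, b), with extents as in Table 9.1."""
    return [
        box(a, a + r, b - r, b - 1), box(a - r, a - 1, b - r, b),
        box(a - r, a, b + 1, b + r), box(a + 1, a + r, b, b + r),
        box(a + 1, a + r, b - r, b), box(a - r, a, b - r, b - 1),
        box(a - r, a - 1, b, b + r), box(a, a + r, b + 1, b + r),
    ]


def inductive_step_holds(r, a=0, b=0):
    """True if every u in pnbd(a,b) \\ nbd(a,b) has a quarter-neighborhood inside nbd(a,b)."""
    inner = nbd(a, b, r)
    frontier = pnbd(a, b, r) - inner
    return all(
        any(q <= inner for q in quarter_neighborhoods(ux, uy, r))
        for (ux, uy) in frontier
    )


checked = {r: inductive_step_holds(r) for r in range(1, 5)}
```

Under these assumptions the check succeeds for r = 1, ..., 4, which is consistent with the case depicted in Fig. 9.4 and the "similar argument" for the remaining locations.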
9.6 Byzantine Failures in a Random Network: Sufficient Condition

We obtain a sufficient condition for a network of n nodes deployed uniformly at random,
based on the sufficient condition for the grid network model. To maintain consistency
with the grid network formulation, we again assume a toroidal region of area $\sqrt{n} \times \sqrt{n}$,
with n nodes located uniformly at random. The average degree of a node is the average
number of the remaining n − 1 nodes that fall within its neighborhood (recall that we
are using the $L_\infty$ distance metric), and thus the average degree is $d_{avg}(n,p) = \frac{4r^2(n,p)(n-1)}{n} =
4r^2(n,p)\left(1 - \frac{1}{n}\right) \approx 4r^2(n,p)$ for large n.
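The average-degree expression can be compared against a direct simulation. The following is a rough sketch, assuming n points placed uniformly at random on a $\sqrt{n} \times \sqrt{n}$ torus and an $L_\infty$ range r with $2r < \sqrt{n}$; the function names, the seed, and the parameter values are arbitrary illustrative choices.

```python
import random


def expected_degree(n, r):
    """Average degree 4 r^2 (n - 1) / n = 4 r^2 (1 - 1/n) in the L-infinity metric."""
    return 4 * r * r * (n - 1) / n


def simulated_average_degree(n, r, seed=0):
    """Empirical average degree for one random placement on a sqrt(n) x sqrt(n) torus."""
    rng = random.Random(seed)
    side = n ** 0.5
    points = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]

    def torus_gap(u, v):
        gap = abs(u - v)
        return min(gap, side - gap)

    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = torus_gap(points[i][0], points[j][0])
            dy = torus_gap(points[i][1], points[j][1])
            if max(dx, dy) <= r:        # L-infinity distance on the torus
                edges += 1
    return 2 * edges / n


analytic = expected_degree(400, 2)              # 16 * 399 / 400
empirical = simulated_average_degree(400, 2)    # one random placement
```

A single placement fluctuates around the analytic value, so the two numbers should be close but are not expected to coincide.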

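Theorem 27. Assuming the $L_\infty$ metric, in a random network with Byzantine failure
probability $p < \frac{1}{2}$, and $r(n,p) \ge \sqrt{\frac{100 \ln n}{\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}}}$:

\[ \lim_{n \to \infty} \Pr[\text{reliable broadcast succeeds}] = 1 \]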


Proof.  We  begin  with  the  observation  that  if  r{n,p)  becomes  so  large  that  a  node’s  range 
spans  the  entire  network,  all  nodes  are  neighbors,  and  trivially  broadcast  is  achievable. 
Thus,  this  result  is  of  interest  only  so  long  as  r{n,p)  is  not  so  large. 

In  light  of  Fact  1 : 


(9.10) 


(9.11) 


Similar to grid networks, we use a notion of quarter-neighborhoods. For a given broadcast
instance, we again use relative coordinates by treating the source's coordinates as (0, 0).
With some abuse of the grid network notation introduced in Section 9.1, we can extend
the notion of $nbd(x,y)$ to include all nodes within distance r of point (x, y) (regardless of
whether or not there is a node at (x, y)), where x and y are real numbers. The notion of
$pnbd(x,y)$ is also similarly extended to all points (x, y).

Note that in this model, a node's (or point's) coordinates are real numbers. We thus
associate eight quarter-neighborhoods with each node, with spatial extents as in Table 9.1,
except that now x and y must be treated as real numbers. Also, now it is not possible
to assert that there are only 2n distinct quarter-neighborhoods. Thus, all eight
quarter-neighborhoods of a node must be treated as distinct^, yielding 8n quarter-neighborhoods
in all.

2  /  \ 

The  quarter- neigborhoods  are  axis-parallel  rectangles  of  area  r{n,p){r{n,p)  —  l)  >  ^  2’^ 
(for  r{n,p)  >  2).  Then,  if  r‘^(n,p)  >  -j — ^ ^  then  we  can  apply  Lemma  58  for  all 

2  “P+2  ™  2(l-p) 

axis-parallel  rectangles  of  area  r{n,p){r{n,p)  —  1)  >  t — „  m 1 —  >  1  to  obtain 

2  “P+2  ™  2(l-p) 

that  they  all  have  at  least  ^ — „  ,^1 !° ^  1 - 50 Inn  >  t — 1 —  >  nodes,  with 

2  “P+  2  ™  2(l-p)  2  “P+  2  ™  2(l-p) 

probability  at  least  1  —  ^  i 

Thus  all  such  rectangles  are  non-empty.  Also: 


1 

2 


25  In  n  25  In  n  8  In  n 

p  +  h^^2(i-p)  ~  ^iQiWp) 


(9.12) 


Hence,  all  the  quarter-neighborhoods  have  at  least  nodes  (which  is  the  quarter- 

neighborhood  population  in  the  grid  network  case).  Then  using  a  proof  argument  similar 
to  Lemma  44,  one  can  prove  the  following  result; 

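Lemma 45. If $p < \frac{1}{2}$ and $r(n,p) \ge \sqrt{\frac{100 \ln n}{\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}}}$, then:

\[ \lim_{n \to \infty} \Pr[\text{all } 8n \text{ qnbds have non-faulty majority}] = 1 \]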


In light of this, one can use a broadcast protocol similar to that for grid networks (a
node commits to a value if it is received from a majority of the nodes in some
quarter-neighborhood), and, for all broadcast sources and instances, the reliable broadcast
properties continue to hold, as follows:

Relying on Lemma 45, we can apply a proof argument similar to Theorem 25 to argue
that with high probability no non-faulty node will commit to a wrong value.

We can also show that each non-faulty node will eventually be able to commit to the
correct value w.h.p. The proof is by induction, similar to the proof of Theorem 26, except
that the terms $nbd(x,y)$, $pnbd(x,y)$ must be interpreted as per their re-definition in this
section (i.e., the region within distance r of a point (x, y), regardless of whether there is a
node at that point).

In the base case, all neighbors of the source (which is at (0, 0)) commit to the correct
value trivially. In the inductive step, one can show that if all nodes in $nbd(x, y)$ (as per the
re-defined notation) have committed to the correct value, all nodes in the region $pnbd(x, y) \setminus
nbd(x,y)$ have some quarter-neighborhood contained in $nbd(x,y)$, and can commit to the
value received from a majority of nodes in this quarter-neighborhood. □

^Note that distinct does not mean disjoint.

Since the area within range of a node is at most $4r^2$ (for the valid domain of r values)
in the $L_\infty$ metric, the result indicates that an average node degree $d_{avg}$ of
$\frac{400 \ln n}{\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}}$ suffices for reliable broadcast. Hence the critical average node degree is
$O\left(\frac{\ln n}{\frac{1}{2}\ln\frac{1}{2p} + \frac{1}{2}\ln\frac{1}{2(1-p)}}\right)$.
A more intuitive way of viewing the result is that the critical average degree in a random
network is $O\left(\max\left\{\ln n, \frac{\ln n}{D(Q_{1/2}\|P)}\right\}\right)$ or $O\left(\ln n + \frac{\ln n}{D(Q_{1/2}\|P)}\right)$.

9.7  Crash-Stop  Failures  in  a  Grid  Network 

We now consider the achievability of reliable broadcast in a grid network when nodes may
cease to function with probability p. This is equivalent to the network being connected
despite failures. Our results for this scenario improve upon prior results by Shakkottai et
al. in [101].

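Theorem 28. In a grid network where nodes can exhibit crash-stop failure with probability
$p \le 1 - \frac{1}{\ln n}$, if $r(n,p) < \max\left\{1, \frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}}\right\}$:

\[ \lim_{n \to \infty} \Pr[\text{disconnection}] = 1 \]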


Proof. Evidently the minimum transmission range required for connectivity is at least 1,
corresponding to $d = d_{min} = 8$ (in the $L_\infty$ metric), else the degree of all nodes is 0 (except
in the case when all nodes are faulty, and connectivity becomes irrelevant). Thus, we only
focus on the case where $\frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}} > 1$. In this scenario $r(n,p) < \max\left\{1, \frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}}\right\}$ implies that
$r(n,p) < \frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}}$.

We show that when $p \le 1 - \frac{1}{\ln n}$, the network is asymptotically disconnected with
probability approaching 1 if $r < \frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}}$.

In the $L_\infty$ metric, having $r(n,p) \le \frac{1}{4}\sqrt{\frac{\ln n}{\ln\frac{1}{p}}}$ yields a node degree $d(n,p) = 4r^2 + 4r \le
8r^2 \le \frac{\ln n}{2 \ln\frac{1}{p}}$.

Consider a particular node j in the network. If j is non-faulty, but all its neighbors are
faulty, we have a potential disconnection event. Given that there are d neighbors, and each
may fail independently with probability p, the probability that j does not fail, but all nodes
in $nbd(j)$ fail, is $(1-p)p^d$.

Since $p \le 1 - \frac{1}{\ln n}$, we obtain that:

\[ 1 - p \ge \frac{1}{\ln n} \tag{9.13} \]

\[
\begin{aligned}
\Pr[\text{a given node } j \text{ is non-faulty, but isolated}]
 &= \Pr[\, j \text{ is non-faulty and all neighbors of } j \text{ are faulty} \,] \\
 &= (1-p)p^d \ge \frac{(\ln n)^3}{n} \quad \text{for large } n
\end{aligned}
\tag{9.14}
\]

Note the following:

\[ d \le \frac{\ln n}{2 \ln\frac{1}{p}} \le \frac{\ln n}{2(1-p)} \le \frac{(\ln n)^2}{2} \quad \text{(Fact 1, (9.13))} \tag{9.15} \]

Let us mark out a subset of nodes j such that the neighborhoods of these nodes are all
disjoint, as in Fig. 9.1. Then, from Fact 3, the number of such nodes that we may obtain
is at least $\frac{n}{2d}$ for large n.

Let $I_j$ be an indicator variable that takes value 1 if j is non-faulty but isolated. Then
$\Pr[I_j = 1] \ge \frac{(\ln n)^3}{n}$, and all $I_j$'s are i.i.d.

Let X be a random variable denoting the number of nodes from the chosen set that are
non-faulty and isolated. Then $X = \sum_j I_j$, and $E[X] \ge \frac{n}{2d}\left(\frac{(\ln n)^3}{n}\right) \ge \frac{n}{(\ln n)^2}\left(\frac{(\ln n)^3}{n}\right) = \ln n$. We can
thus set $\beta = \frac{1}{2}$ in the Chernoff bound of Lemma 53, and obtain that:

\[ \Pr\left[X \ge \frac{\ln n}{2}\right] \ge 1 - e^{-\frac{\ln n}{8}} = 1 - \frac{1}{n^{1/8}} \tag{9.16} \]
Thus, for $p \le 1 - \frac{1}{\ln n}$ and large n:

\[ \lim_{n \to \infty} \Pr[\text{at least two non-faulty nodes are isolated}] = 1 \tag{9.17} \]

Hence a broadcast from one such node will not be received by the other node. □

We also briefly touch upon the range of p values satisfying $1 - p = o(\frac{1}{n})$. By applying
the union bound, the probability that at least one node is non-faulty is at most $n(1-p)$.
Since $1 - p = o(\frac{1}{n})$, we know that $\lim_{n \to \infty} n(1-p) = 0$. Therefore:

\[ \lim_{n \to \infty} \Pr[\text{all nodes are faulty}] = 1 \tag{9.18} \]

Thus the issue of connectivity is irrelevant.
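The isolation event at the heart of the necessary condition is easy to evaluate numerically. The sketch below assumes the degree cap $d \le \ln n / (2 \ln\frac{1}{p})$ used in the proof, under which $p^d \ge n^{-1/2}$ and hence $(1-p)p^d \ge (1-p)/\sqrt{n}$; the function names and sample values are illustrative assumptions.

```python
import math


def isolation_probability(p, d):
    """Pr[a node is non-faulty while all d of its neighbors have crashed]."""
    return (1 - p) * p**d


def degree_cap(n, p):
    """Largest degree allowed by the assumed cap d <= ln n / (2 ln(1/p))."""
    return math.log(n) / (2 * math.log(1 / p))


n, p = 10**6, 0.5
d = 8                                    # d_min, below the cap of roughly 9.97
lower_bound = (1 - p) / math.sqrt(n)     # (1 - p) * n^(-1/2)
actual = isolation_probability(p, d)
```

For these values the isolation probability is about 0.002, comfortably above the lower bound of 0.0005, so a set of nodes with disjoint neighborhoods is expected to contain isolated non-faulty nodes once it is large enough.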

We  now  present  a  sufficient  condition  for  the  asymptotic  connectivity. 

Theorem 29. In a grid network with crash-stop failure probability $p < 1$, when $r(n,p) \ge
\max\left\{1, \sqrt{\frac{8 \ln n}{\ln\frac{1}{p}}}\right\}$:

\[ \lim_{n \to \infty} \Pr[\text{the network is connected}] = 1 \]

Proof. Case $p = o(\frac{1}{n})$:
When the failure probability is so small as to fall in this range, by applying the
union bound, we obtain that the probability of even a single node failing is at most np.
Since $\lim_{n \to \infty} np = 0$, asymptotic connectivity is trivially ensured even with the minimum
transmission range of 1.

Note that when p is $\Omega(\frac{1}{n})$, then $r(n,p) > 1$ for large enough n. Consider the subdivision
of the grid as depicted in Fig. 9.5, so that the resulting cells have x-extents (and also
y-extents) 0 to a, a + 1 to a + b, a + b + 1 to 2a + b + 1, 2a + b + 2 to 2a + 2b + 1, and
so on, where $a = \lfloor \frac{r}{2} \rfloor$ and $b = r - a = r - \lfloor \frac{r}{2} \rfloor$. It is easy to see that each node is within
range of all other nodes in the cells adjoining its own (as depicted in Fig. 9.5). If each cell
has at least one non-faulty node, there exists a connected backbone that covers all points,
and hence all nodes. Therefore, all non-faulty nodes are connected to each other via this
backbone. The populations of the cells thus obtained can be $(a+1)^2$, $(a+1)b$ or $b^2$. Since
$a + 1 = \lfloor \frac{r}{2} \rfloor + 1 > \frac{r}{2}$, and $b = r - \lfloor \frac{r}{2} \rfloor \ge \frac{r}{2}$, the population k of any cell satisfies $k \ge \frac{r^2}{4}$, and
the maximum possible number of cells $m \le \frac{4n}{r^2}$. Then:

\[ \Pr[\text{no non-faulty node in a given cell}] = p^k \le p^{\frac{r^2}{4}} \tag{9.19} \]

Since $r \ge \sqrt{\frac{8 \ln n}{\ln\frac{1}{p}}}$:

\[ \Pr[\text{no non-faulty node in a given cell}] \le p^{\frac{r^2}{4}} \le p^{\frac{2 \ln n}{\ln\frac{1}{p}}} = \frac{1}{n^2} \tag{9.20} \]

The total number of cells is at most $\frac{4n}{r^2} \le 4n$ since $r \ge 1$ (however note that $\frac{4n}{r^2}$ is actually
less than n for large enough n, whenever $p = \Omega(\frac{1}{n})$). Applying a union bound over all cells:

\[ \Pr[\text{at least 1 non-faulty node in each cell}] \ge 1 - \frac{4}{n} \tag{9.21} \]

Since this condition ensures connectivity, we obtain that:

\[ \lim_{n \to \infty} \Pr[\text{network is connected}] = 1 \tag{9.22} \]

□

Figure 9.5: Subdivision of network into cells (all adjacent cells are within range)
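The two geometric facts used about the cell subdivision, namely that adjoining cells are within range of each other and that every cell holds at least $r^2/4$ nodes, can be checked per axis. This is a small sketch with hypothetical names; it examines the one-dimensional extents only, which suffices because the $L_\infty$ distance is the maximum over the two axes.

```python
def cell_intervals(r, count):
    """First `count` cell extents along one axis: widths alternate a + 1 and b."""
    a = r // 2
    b = r - a
    intervals, start = [], 0
    for i in range(count):
        width = a + 1 if i % 2 == 0 else b
        intervals.append((start, start + width - 1))
        start += width
    return intervals


def adjacent_cells_in_range(r, count=6):
    """Largest coordinate gap between points of consecutive cells must not exceed r."""
    iv = cell_intervals(r, count)
    return all(iv[i + 1][1] - iv[i][0] <= r for i in range(count - 1))


def min_cell_population(r, count=6):
    """Smallest cell population: the square of the narrowest extent."""
    narrowest = min(hi - lo + 1 for lo, hi in cell_intervals(r, count))
    return narrowest * narrowest


extents_for_r5 = cell_intervals(5, 4)   # a = 2, b = 3
```

For r = 5 the extents are 0 to 2, 3 to 5, 6 to 8 and 9 to 11, matching the pattern 0 to a, a + 1 to a + b, a + b + 1 to 2a + b + 1, 2a + b + 2 to 2a + 2b + 1.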
Figure 9.6: Relationship between $L_\infty$ and $L_2$ neighborhoods


9.8  Conditions  in  Euclidean  Metric 

We show that our results derived for the $L_\infty$ metric continue to hold for the $L_2$ metric, with only
the constants in the theta notation changing.

Lemma 46. If reliable broadcast is achievable asymptotically in $L_\infty$ for all $r \ge r_{min}$, then
it is achievable asymptotically in $L_2$ for all $r \ge \sqrt{2} r_{min}$.

Proof. The proof is by contradiction. Suppose that, for a given failure configuration, broadcast
is asymptotically achievable in $L_\infty$ for all $r \ge r_{min}$ but is not asymptotically achievable
for all $r \ge \sqrt{2} r_{min}$ in $L_2$. Observe that it is possible to circumscribe an $L_\infty$ neighborhood
of range r by an $L_2$ neighborhood of range $\sqrt{2}r$ (Fig. 9.6). Hence the non-faulty nodes in
an $L_2$ network of transmission range $\sqrt{2}r$ can be made to simulate the operation of nodes
in an $L_\infty$ network with range r (as the $L_\infty$ neighborhood is fully contained within the $L_2$
neighborhood). Also, given that all nodes in the $L_2$ network know the locations of their
neighbors, and no address spoofing is allowed, the faulty nodes (in the Byzantine failure
case) cannot gain any unfair advantage by not simulating the $L_\infty$ network. If there is
some $r \ge r_{min}$ for which we can achieve broadcast in the $L_\infty$ network asymptotically, but
not in the $L_2$ network of range $\sqrt{2}r$, we obtain a contradiction, as achievability in the
$L_\infty$ network would imply achievability in the $L_2$ network. This implies that if broadcast is
achievable in the $L_\infty$ network of range r, so must it be in the $L_2$ network of range $\sqrt{2}r$. □

Lemma 47. If reliable broadcast fails asymptotically in $L_\infty$ for all $r < r_{\min}$, then it fails
asymptotically in $L_2$ for all $r < r_{\min}$.

Proof. The proof is by contradiction. Suppose that broadcast fails asymptotically in $L_\infty$
for range $r$, but does not fail in $L_2$ for range $r$. Observe that an $L_\infty$ neighborhood of
transmission range $r$ circumscribes an $L_2$ neighborhood of range $r$ (Fig. 9.6). Thus, for any
given failure configuration, if broadcast succeeds in the $L_2$ network of range $r$, so can
it in the $L_\infty$ network of radius $r$, as we could simply make the fault-free nodes in the $L_\infty$
network simulate the behavior of nodes in the $L_2$ network. Hence, if broadcast does not fail
in the $L_2$ network of range $r < r_{\min}$, it will not fail in the $L_\infty$ network of range $r < r_{\min}$.
This yields a contradiction. □
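
Both lemmas rest on two geometric containments between neighborhoods. As an illustrative check (not part of the original argument), the following sketch verifies them on an integer grid for small ranges; the function names are hypothetical, and squared radii are used so that the comparison stays in exact integer arithmetic.

```python
from itertools import product

def linf_ball(r):
    """Integer grid offsets within L-infinity distance r of the origin."""
    return {(x, y) for x, y in product(range(-r, r + 1), repeat=2)}

def l2_ball(radius_sq, bound):
    """Integer grid offsets whose squared Euclidean distance is <= radius_sq."""
    return {(x, y) for x, y in product(range(-bound, bound + 1), repeat=2)
            if x * x + y * y <= radius_sq}

def containments_hold(r):
    square = linf_ball(r)
    # Used in Lemma 46: the L2 ball of radius sqrt(2)*r covers the L-infinity ball of range r.
    covers_square = square <= l2_ball(2 * r * r, 2 * r)
    # Used in Lemma 47: the L-infinity ball of range r covers the L2 ball of radius r.
    inside_square = l2_ball(r * r, 2 * r) <= square
    return covers_square and inside_square
```

Calling `containments_hold(r)` for a few small values of `r` confirms that any node reachable in one metric's neighborhood is also reachable in the enclosing neighborhood of the other metric, which is what allows one network to simulate the other.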

9.9 Non-Toroidal Networks

We used the assumption that the network is toroidal to avoid edge effects. However, it can
be seen that the results (in terms of transmission range $r(n,p)$) would continue to hold even
if the network were spread over a non-toroidal rectilinear domain.^

The necessary condition would continue to hold, since the area within transmission
range of nodes at the edges can be no more than the area within transmission range (and
hence degree) of nodes towards the center. If reliable broadcast is not achievable for a
certain value of $r(n,p)$ even with the assumption that all nodes have equal network area
within their transmission range, then it must certainly be impossible when some nodes
(those near the edges of the network region) have a smaller area within range.

The  sufficient  conditions  for  Byzantine  failures  continue  to  hold  since  the  described 
algorithms  rely  on  information  from  quarter-neighborhoods,  and  it  can  be  seen  that  even 
the  nodes  at  the  edges  have  at  least  one  quarter-neighborhood  within  the  network  region. 
Hence,  if  some  value  of  r(n,p)  suffices  in  a  toroidal  network,  the  same  would  suffice  in  the 
corresponding  non-toroidal  network  as  well.  For  the  crash-stop  failure  case,  the  sufficient 
condition  continues  to  hold  as  even  nodes  at  the  edges  have  at  least  one  full  cell  within 
their  range. 

9.10  Discussion 

An interesting observation is that the form of the results for Byzantine failures is very
similar to the results for crash-stop failures/connectivity. For Byzantine failures, we have
obtained that the critical node degree for grid networks is $\Theta\left(d_{\min} + \frac{\ln n}{\ln\frac{1}{2p} + \ln\frac{1}{2(1-p)}}\right)$, which
may be re-stated as $\Theta\left(d_{\min} + \frac{\ln n}{D(Q_{1/2}\|P)}\right)$, where $Q_{1/2}$ denotes the Bernoulli($\frac{1}{2}$) distribution,
$P$ denotes the Bernoulli($p$) distribution, and $D(Q\|P)$ denotes the relative entropy (or
Kullback-Leibler distance) between distributions $Q$ and $P$. Similarly, the node degree
for crash-stop failures/connectivity is $\Theta\left(d_{\min} + \frac{\ln n}{\ln\frac{1}{p}}\right)$, which may be viewed as $\Theta\left(d_{\min} + \frac{\ln n}{D(Q_1\|P)}\right)$,
where $Q_1$ is the Bernoulli(1) distribution, and $P$ is the Bernoulli($p$) distribution
(using the standard convention that $0 \ln \frac{0}{1-p} = 0$ where $1 - p > 0$ is a valid probability
value). These results have a similar structural form, involving a minimum term required
for connectivity without faulty behavior, and a second term required to ensure broadcast
even in presence of failure.

^Note that the degree in a non-toroidal network is a function of node location; hence it is more relevant
to state results in terms of transmission range $r(n,p)$ instead of degree.

Recall that we derive the necessary condition from isolated failure events, and this is
found to match the sufficient condition within a constant factor. Thus, it is possible that
failure events involving isolated nodes not determining the correct broadcast value may be
the dominant failure events.

Focusing on these isolated failure events, the obtained expressions for node degree can
be explained in the light of Sanov's Theorem [24]. As per Sanov's Theorem, the probability
of occurrence of the event-set $\mathcal{E}$ = {half or more neighbors faulty} is dominated
by the probability of the event in $\mathcal{E}$ closest in relative entropy to the governing fault
distribution $P$. Since we are considering the regime $p < \frac{1}{2}$, the closest event is that of
exactly half the neighbors being faulty, corresponding to $Q_{1/2}$. In light of this, the critical
degree expression for Byzantine failures is quite intuitive. One can similarly explain the
crash-stop results.
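
To make the role of relative entropy concrete, the sketch below computes $D(Q\|P)$ between two Bernoulli distributions and compares it with the closed forms $\frac{1}{2}\ln\frac{1}{4p(1-p)}$ for $Q_{1/2}$ and $\ln\frac{1}{p}$ for $Q_1$. These closed forms are standard identities rather than statements taken from the text, and the function name and the sample value of `p` are illustrative assumptions.

```python
import math

def kl_bernoulli(q, p):
    """Relative entropy D(Bernoulli(q) || Bernoulli(p)) in nats, for 0 < p < 1.

    Uses the convention 0 * ln(0 / x) = 0.
    """
    def term(a, b):
        return 0.0 if a == 0 else a * math.log(a / b)
    return term(q, p) + term(1 - q, 1 - p)

p = 0.1                           # illustrative failure probability
d_half = kl_bernoulli(0.5, p)     # exactly half the neighbors faulty (Byzantine case)
d_one = kl_bernoulli(1.0, p)      # all neighbors faulty (crash-stop case)

closed_half = 0.5 * math.log(1 / (4 * p * (1 - p)))
closed_one = math.log(1 / p)
```

Read heuristically, if the probability of the dominant bad event among $d$ neighbors decays roughly as $e^{-d \cdot D(Q\|P)}$, then driving it below about $1/n$ calls for $d$ on the order of $\ln n / D(Q\|P)$, which matches the form of the second term in the degree expressions.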

The  necessary  and  sufficient  condition  for  connectivity  in  a  sensor  network  where  nodes 
sleep  with  probability  p  was  shown  in  [55]  to  be  (when  expressed  in  our 

notation)  for  the  case  of  a  randomly  deployed  network.  This  problem  is  equivalent  to  that 
of  crash-stop  failures  in  random  networks.  Our  sufficient  condition  for  random  networks 
with  Byzantine  failure  probability  p  <  ^  is  0(t — i — ). 

2-P+2  in  2(l-p) 

There  is  a  similarity  of  form  in  the  two  results,  and  one  may  interpret  the  critical  node 
degree  as  being  0(lnn(l  —  p)  +  )  where  Q  is  the  Bernoulli{q)  distrbution,  and  P 

is the Bernoulli($p$) distribution; $q = 1$ (and $p < 1$) for the sleeping/crash-stop case in [55],
and $q = \frac{1}{2}$ (with $p < \frac{1}{2}$) for the Byzantine failure case.

Additionally, it is evident that our expressions for the grid network and random network
diverge when $p \to 0$, but are otherwise within a constant factor of each other (for $p$ bounded
away from 0). This difference is quite intuitive. In a grid network, as failure probability
$p \to 0$, the network tends towards a deterministic topology, whereas in a random network,
if failure or sleep probability $p \to 0$, the network can only tend towards a denser but still
random network. Thus, at small values of $p$, a very small degree will suffice for a grid
network, but may not for a random network. At larger $p$ values, the grid network exhibits
increasing randomness and begins to resemble a network with random deployment. Thus,
one may see that the two expressions are within a constant factor of each other when $p$ is
large (given sufficiently large $n$), but diverge as $p \to 0$.

^Note that in [42], it was found that the primary disconnection events in non-faulty random networks
are those involving single isolated nodes.
Chapter 10

Reliable  Local  Broadcast  with 
Byzantine  Failures 


In  Chapter  7,  we  briefly  reviewed  results  in  the  literature  on  achieving  reliable  broadcast 
in  wireless  networks.  In  Chapter  8  and  Chapter  9,  we  described  results  for  achievability  of 
reliable broadcast under different assumptions regarding the network and fault model. Our
results in these previous chapters, as well as a substantial amount of the prior work reviewed
in Chapter 7, assume that if a node transmits a message, it is received by each and every
node within a designated neighborhood in its spatial vicinity. This eliminates the potential
for duplicity by a Byzantine source node, and ensures local agreement. While this model
reflects the shared nature of the wireless medium, it fails to capture its unreliability. The
wireless medium can be extremely unreliable, and can show highly variable channel quality
over time, due to multipath effects. This can lead to significant fluctuation in the received
signal. As a result, there is often a non-negligible probability of unsuccessful reception,
even in the absence of malicious collision-causing behavior. Thus, any attempt at designing
reliable  broadcast  protocols  based  on  these  theoretical  results  must  begin  with  an  effort  to 
implement  a  reliable  local  broadcast  primitive  in  a  scalable  manner. 

One  might  envision  implementing  local  broadcast  by  running  a  point-to-point  Byzantine 
agreement  protocol,  with  retransmissions  over  every  lossy  (point-to-point)  link  to  handle 
channel  errors.  However,  such  a  solution  may  not  be  scalable,  as  the  underlying  medium  is 
shared  and  thus  the  operation  of  nearby  (point-to-point)  links  cannot  occur  concurrently, 
and  must  be  serialized. 

While the issues of reliable broadcast and consensus in the presence of a bounded number
of collisions/spoofing have been addressed in recent years, e.g., in [58] and [38], probabilistic
channel losses have typically not been considered. Random transient Byzantine failures that
include collision-causing behavior are examined in [91]. Though also of a probabilistic nature, their
model is different in that nodes either fail to transmit, transmit a wrong value or transmit
out of turn, with a certain probability, in each round.

In this chapter, we investigate the possibility of designing Byzantine fault-tolerant
communication primitives that can work in the presence of channel unreliability. We continue
to assume that the physical (PHY) and medium-access control (MAC) layers are fault-free
(i.e., nodes do not deliberately cause collisions or spoof MAC addresses). Our primary intent
is to highlight the potential for lightweight scalable solutions that exploit knowledge of
physical layer characteristics, in conjunction with other information provided by lower layers,
to achieve message-ordering conditions useful for reliable communication. We sketch out
a simple proof-of-concept algorithm that can facilitate the implementation of reliable local
broadcast with probabilistic guarantees in a local broadcast domain. We also briefly discuss
how the proposed reliable local broadcast solution can be optimized further, and also be
used as a sub-protocol in a global broadcast algorithm for multi-hop networks.

A  preliminary  version  of  the  work  described  in  this  chapter  was  reported  in  [10]. 

10.1  How  a  Lossy  Wireless  Channel  Inhibits  Reliable  Local 
Broadcast 

In this section we briefly discuss how an unreliable wireless channel can affect the
achievability of reliable local broadcast.

Consider a source $s$ that originates a message, which needs to be locally broadcast to its
neighbors. However, as the channel is lossy, each neighbor successfully receives the message
only with a certain probability. As a result, it is possible that a transmission may only be
heard by some subset of $s$'s neighbors. If $s$ is non-faulty, this issue can be readily resolved by
having $s$ retransmit the message a sufficient number of times to ensure that each neighbor
receives at least one copy with high probability (w.h.p.). However, consider what might
transpire if $s$ is faulty, and seeks to leverage the channel's unreliability to create confusion
amongst its neighbors:

Suppose that $s$ initially sends a message $m$ with value 0. Some of its neighbors do not
receive it, i.e., it is received by some subset $\mathcal{N}_1$ of $s$'s neighbors. It then sends another
version of the same message, containing a value 1. This message is received by some subset
$\mathcal{N}_2$. If $\mathcal{N}_1 \setminus \mathcal{N}_2$ is non-empty, there are certain nodes that will assume that $s$ sent only
one value, i.e., 0. If $\mathcal{N}_2 \setminus \mathcal{N}_1$ is non-empty, there are certain nodes that will assume that $s$
sent only one value, i.e., 1. Nodes in $\mathcal{N}_1 \cap \mathcal{N}_2$ receive both values, and are in a position to
detect $s$'s duplicity. These nodes can choose a default value, e.g., the first value sent by $s$.
However, there still remains the issue of ensuring that the other nodes do the same. One
approach might consist in the raising of an alarm by nodes in $\mathcal{N}_1 \cap \mathcal{N}_2$, but would require
a means for the other nodes to resolve whether the alarm(s) are to be trusted. Another
possible approach involves using a point-to-point Byzantine agreement algorithm in the
neighborhood of $s$. However, these approaches have high message-overhead. In particular,
given the shared nature of the wireless medium, the messages must be sent in turn on the
same medium, thereby exacerbating the cost.
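
The scenario above can be illustrated with a small simulation. This is a sketch under the assumption that each neighbor receives each transmission independently with some probability `p_s`; the function name, the dictionary keys, and the sample parameters are hypothetical and chosen only for illustration.

```python
import random

def duplicity_partition(neighbors, p_s, rng):
    """Simulate a faulty source sending value 0 and then value 1 over a lossy channel.

    Each neighbor independently hears each transmission with probability p_s.
    Returns how the neighbors are split by what they heard.
    """
    neighbors = list(neighbors)
    n1 = {u for u in neighbors if rng.random() < p_s}   # heard value 0
    n2 = {u for u in neighbors if rng.random() < p_s}   # heard value 1
    return {
        "only_0": n1 - n2,                     # believe s sent only 0
        "only_1": n2 - n1,                     # believe s sent only 1
        "both": n1 & n2,                       # can detect the duplicity
        "neither": set(neighbors) - (n1 | n2),
    }

parts = duplicity_partition(range(10), p_s=0.7, rng=random.Random(1))
```

Whenever both `only_0` and `only_1` are non-empty, the non-faulty neighbors hold conflicting views of what $s$ sent, which is exactly the disagreement that a reliable local broadcast primitive must resolve.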

Thus,  one  may  prefer  to  have  a  more  lightweight  approach  to  ensure  agreement  of  all 
nodes  on  a  common  value  (and  potentially  rely  on  the  fact  that  after  a  number  of  duplicitous 
transmissions  by  s,  all  nodes  would  at  some  time  detect  its  duplicity  themselves,  and  s  would 
be  universally  identified  as  untrustworthy). 

10.2  Causal  Ordering  and  Physical  Clocks 

In  this  section,  we  briefly  review  notions  of  clocks  and  ordering  that  are  relevant  to  the 
discussion  in  this  chapter. 

We  assume  the  existence  of  some  frame  of  reference  external  to  the  system.  The  physical 
time  in  this  frame  of  reference  is  considered  to  be  an  absolute  measure  of  physical  time  for 
the  purpose  of  our  discussion.  Thus,  at  time  instant  t,  the  external  clock  value  is  t. 

Each node $u$ in the system has its own physical clock. The clock value of a node $u$ at
time instant $t$ is denoted by $C_u(t)$. When we refer to external synchronization within bound
$D$, we imply synchronization to this ideal external clock within bound $D$, i.e., at each time
instant $t$: $|C_u(t) - t| \le D$.

Clock drift is modeled as being linear, i.e., if the true elapsed time is $T$, the observed
elapsed time lies in the range $[(1 - \delta)T, (1 + \delta)T]$, where $\delta$ is the drift per unit time (also
referred to as drift-rate).

When we refer to internal synchronization within bound $D$, we imply that at any time
instant $t$, the clocks of two internally synchronized nodes $u$, $w$ satisfy: $|C_u(t) - C_w(t)| \le D$.
When we refer to a node adjusting its clock, we imply that the node applies a correction to
its clock value.
In his seminal paper [69], Lamport proposed that a key goal in a distributed system
should be to ensure that causal relationships are respected. This causality could be captured
in a happened-before relation, which imposes a partial order on system events. Thus, $a \to b$
implies that $a$ happened-before $b$, and $b$ may be causally affected by $a$. Let $C(a)$ denote the
time observed for an event $a$ as per a clock $C$. A satisfactory clock $C$ must then satisfy the
following:

Definition 4. (Clock Condition [69]) For any events $a, b$: $a \to b \Rightarrow C(a) < C(b)$.

To this effect, Lamport logical clocks were proposed in [69]. An anomalous scenario
was also considered whereby out-of-system message exchanges could lead to violation of the
Clock Condition. This leads to the consideration of a Strong Clock Condition whereby causal
ordering is preserved even taking into account out-of-system messages. It was observed in
[69] that if the clock drift rate $\delta$, the maximum clock skew (or synchronization bound) $D$
and the minimum message transmission time $T_l$ satisfy the relation $T_l \ge \frac{D}{1-\delta}$, then the
system of physical clocks satisfies the Strong Clock Condition. It was also shown that a
simple synchronization algorithm suffices to ensure that clock skew is bounded by a suitable
$D$.

The  notion  of  leveraging  physical  clocks  rather  than  logical  clocks  has  wider  significance. 
Consider  a  system  where  some  processes  may  exhibit  Byzantine  behavior.  Then  their  logical 
clock  values  cannot  be  trusted,  as  they  may  affix  incorrect  logical  clock  values  to  messages 
they  send,  in  order  to  taint  the  logical  clocks  of  other  processes.  If  one  could  ensure  that  the 
physical  clocks  of  non-faulty  nodes  satisfy  certain  ordering  conditions,  this  could  be  quite 
beneficial.  A  similar  intuition  underlies  our  approach  towards  reliable  local  broadcast. 

10.3  Loose  Synchronization  and  Local  Broadcast 

In this section we describe the basic assumptions and approach behind leveraging the
existence of loose synchronization to facilitate a certain ordering condition between locally
broadcast messages. In Section 10.4, we discuss how the ordering condition can be realized
in a wireless network, and subsequently describe in Section 10.5 how it might be leveraged
to achieve reliable local broadcast with probabilistic guarantees.

Consider a system comprising a node $v$ that is interested in sending messages, and a set of
other nodes (neighbors of $v$) capable of receiving messages from $v$ over a shared broadcast
medium.  Each  node  is  equipped  with  a  single  half-duplex  transceiver.  Thus,  no  node 
can  send  and  receive  messages  simultaneously,  and  only  one  message  can  be  successfully 
transmitted  or  received  at  a  time  by  a  node.  Note  that  this  is  a  reasonable  model  for  wireless 
nodes  equipped  with  a  single  half-duplex  transceiver  and  an  omnidirectional  antenna,  which 
operate  on  a  single  common  channel. 

Receive-Timestamp: A node is assumed capable of noting its local physical clock value
just after its physical layer finishes receiving a message (this is also a reasonable assumption;
such a timestamping operation could be implemented in hardware). This is termed the
receive-timestamp observed by the node for the message.

The  messages  sent  in  this  system  have  the  following  property: 

The minimum (absolute) time the packet transmission occupies the channel is $T_l$, and
the actual total (absolute) time taken by a message in transit (between the time the sending
node's physical layer starts sending the message, and the time the receiving node finishes
receiving and notes its receive-timestamp) is upper-bounded by $T_u$. Hence $T_u - T_l$ subsumes
the maximum propagation delay and upper bounds on any processing delays incurred up
to the time of taking the timestamp.

Therefore, the (absolute) time $T$ taken by a message in transit from sender to receiver
(between timestampings) satisfies $T_l \le T \le T_u$. Note that this condition is satisfied by all
messages including those sent by faulty nodes. We explain in Section 10.4 why this is a
reasonable assumption.

We  define  the  following  condition: 

Definition 5. (Receipt-Order Condition) If a node $v$ sends a message $m_1$, followed by
a message $m_2$, then for all non-faulty nodes $u, w$ which are neighbors of $v$: the
receive-timestamp observed by $u$ for $m_2$ is greater than the receive-timestamp observed by $w$ for
$m_1$.

We  identify  two  situations  in  which  the  Receipt-Order  Condition  holds.  The  first  one 
relies  on  assumptions  about  external  clock  synchronization,  and  the  second  one  relies  on 
assumptions  about  internal  clock  synchronization. 
Observation 1. (Externally Synchronized Nodes) If the physical clocks of all non-faulty
nodes in the system are externally synchronized within bound $D$, and if $2T_l - T_u > 2D$, then
the local physical timestamps observed by the non-faulty neighbors of $v$ for messages sent
by $v$ satisfy the Receipt-Order Condition.

Proof. Suppose the sender starts sending the two messages $m_1$, $m_2$ at times $t_1$ and $t_2$
respectively (according to the ideal external clock). Then those non-faulty neighbors of $v$ that
received $m_1$ would have received it within the interval $[t_1 + T_l, t_1 + T_u]$ (as per the external
clock), and their observed receive-timestamp would lie in the range $[t_1 + T_l - D, t_1 + T_u + D]$.
Similarly, the observed receive-timestamp for the second message $m_2$ falls within
$[t_2 + T_l - D, t_2 + T_u + D]$. Since the two messages are sent by $v$, using its half-duplex transceiver,
on the same medium, they are temporally ordered and separated in time, i.e., $t_2 \ge t_1 + T_l$.
Thus, $(t_2 + T_l - D) - (t_1 + T_u + D) = t_2 - t_1 - T_u + T_l - 2D \ge 2T_l - 2D - T_u > 0$. Therefore,
any non-faulty node that receives the first message observes a receive-timestamp that is
less than the receive-timestamp for the second message observed by those non-faulty nodes
that see the second message. Hence, the Receipt-Order Condition holds. □
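
The bound in this proof can be checked numerically. The sketch below is illustrative only: it samples send times, transit times in $[T_l, T_u]$ and clock offsets in $[-D, D]$, then confirms that the observed timestamp gap never falls below the worst-case margin $2T_l - T_u - 2D$. The function names, the uniform sampling, and the extra idle gap of up to 5 time units are assumptions made for the example.

```python
import random

def receipt_order_margin(T_l, T_u, D):
    """Worst-case (earliest timestamp for m2) minus (latest timestamp for m1)."""
    return 2 * T_l - T_u - 2 * D

def simulate_external(T_l, T_u, D, trials, rng):
    """Smallest observed gap between u's timestamp for m2 and w's timestamp for m1."""
    worst = float("inf")
    for _ in range(trials):
        t1 = rng.uniform(0, 100)
        # Half-duplex sender on a single medium: m2 starts no earlier than t1 + T_l.
        t2 = t1 + T_l + rng.uniform(0, 5)
        # Receive-timestamp = send time + transit time + clock offset of the receiver.
        ts_m1_at_w = t1 + rng.uniform(T_l, T_u) + rng.uniform(-D, D)
        ts_m2_at_u = t2 + rng.uniform(T_l, T_u) + rng.uniform(-D, D)
        worst = min(worst, ts_m2_at_u - ts_m1_at_w)
    return worst

margin = receipt_order_margin(T_l=10.0, T_u=12.0, D=3.0)
observed = simulate_external(10.0, 12.0, 3.0, trials=20000, rng=random.Random(7))
```

With these sample values the margin is positive, so every simulated pair of receive-timestamps respects the Receipt-Order Condition; choosing parameters with $2T_l - T_u \le 2D$ removes that guarantee.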

Observation 2. (Internally Synchronized Nodes) Consider an interval of time in the
system in which no non-faulty node adjusts its physical clock, the physical clocks of all
non-faulty nodes stay internally synchronized within bound $D$, and drift-rate is upper-bounded
by $\delta$. We are interested in messages sent and received entirely during this interval. If
$2T_l - T_u - \delta(2T_l + T_u) > D$, then the local physical timestamps observed by the non-faulty
neighbors of $v$ for messages sent by $v$ satisfy the Receipt-Order Condition.

Proof.  The  argument  is  almost  the  same  as  that  used  in  [69]  to  argue  that  a  system  of 
physical  clocks  can  be  made  to  satisfy  the  Strong  Clock  Condition,  except  that  we  now 
apply  it  in  the  context  of  a  broadcast  medium  with  multiple  recipients  of  the  same  message. 

Denote by $E_v^s(m)$ the event of node $v$ sending message $m$, and by $C_u(E_v^s(m))$ the local
physical clock time at some non-faulty node $u$ at the time $v$ started the transmission. Note
that this does not imply that node $u$ is aware of the instant at which transmission started; $u$
may only detect the transmission after some minimum propagation delay. Denote by $E_u^r(m)$
the event of node $u$ receiving message $m$, and by $C_u(E_u^r(m))$ the receive-timestamp observed
by node $u$ for a message $m$ received by it (recall that receive-timestamps are recorded when
the reception has finished).

Suppose a node $v$ starts sending a message $m_1$ at a time when local time at some
non-faulty neighbor $u$ is $C_u(E_v^s(m_1))$. Thus, from the assumption that clocks are internally
synchronized within bound $D$, the local time at any other non-faulty neighbor $w$ must be
$C_w(E_v^s(m_1)) \le C_u(E_v^s(m_1)) + D$, and $w$ will observe a receive-timestamp
$C_w(E_w^r(m_1)) \le C_w(E_v^s(m_1)) + T_u(1 + \delta) \le (C_u(E_v^s(m_1)) + D) + T_u(1 + \delta)$.

If $v$ later starts sending a message $m_2$ when local time at $u$ is $C_u(E_v^s(m_2))$, then
$C_u(E_v^s(m_2)) - C_u(E_v^s(m_1)) \ge T_l(1 - \delta)$. Thus the receive-timestamp $u$ observes for $m_2$
satisfies $C_u(E_u^r(m_2)) \ge C_u(E_v^s(m_2)) + T_l(1 - \delta) \ge C_u(E_v^s(m_1)) + 2T_l(1 - \delta)$. Thus, for $u$
and any other non-faulty node $w$:
$C_u(E_u^r(m_2)) \ge C_u(E_v^s(m_1)) + 2T_l(1 - \delta)$
$= (C_u(E_v^s(m_1)) + D + T_u(1 + \delta)) - T_u(1 + \delta) - D + 2T_l(1 - \delta)$
$\ge C_w(E_w^r(m_1)) + (2T_l(1 - \delta) - T_u(1 + \delta) - D)$
$= C_w(E_w^r(m_1)) + (2T_l - T_u - \delta(2T_l + T_u) - D) > C_w(E_w^r(m_1))$.

Thus  the  Receipt-Order  Condition  is  satisfied.  □ 
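
As a small worked check of the arithmetic in this proof, the sketch below evaluates the slack term in both of the forms in which it appears and confirms that they coincide. The function names and the sample parameter values (in seconds) are illustrative assumptions, not values from the text.

```python
import math

def internal_sync_slack(T_l, T_u, delta, D):
    """Positive exactly when the sufficient condition of Observation 2 holds."""
    return 2 * T_l - T_u - delta * (2 * T_l + T_u) - D

def expanded_slack(T_l, T_u, delta, D):
    """The same quantity, written as it arises in the chain of inequalities."""
    return 2 * T_l * (1 - delta) - T_u * (1 + delta) - D

# Hypothetical values: 1 ms packets, 50 us of extra delay, 10 ppm drift, 0.4 ms skew.
params = dict(T_l=1e-3, T_u=1.05e-3, delta=1e-5, D=4e-4)
slack = internal_sync_slack(**params)
same = math.isclose(slack, expanded_slack(**params), rel_tol=1e-9)
```

For realistic drift rates the term $\delta(2T_l + T_u)$ is tiny, so the condition is dominated by the requirement that the skew bound $D$ stay below roughly $2T_l - T_u$.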

10.4  Network  Model 

Consider a wireless multi-hop network. The set of nodes within transmission range of a
node $v$ is termed $nbd(v)$; $v$ itself is a member of $nbd(v)$. Let $nbd'(v) = nbd(v) \setminus \{v\}$.

For the purpose of our discussion, we focus on a local broadcast domain within the
wireless network, comprising a sender node $s$ and nodes within its transmission range,
denoted by $nbd'(s)$, to which we wish to ensure reliable local broadcast delivery. We denote
$|nbd'(s)|$ by $d$, and define $d_0 = \min_{x \in nbd'(s)} |nbd'(x) \cap nbd'(s)|$. Thus $d_0$ is the minimum number
of common neighbors of $s$ and any of its neighbors.
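
As a concrete illustration of these definitions, the following sketch computes $d$ and $d_0$ from an adjacency map. The function name, the map layout, and the four-node topology are hypothetical; the map is assumed to store $nbd'(\cdot)$, i.e., each node's neighbors excluding the node itself.

```python
def local_domain_parameters(adj, s):
    """Return (d, d0) for source s.

    adj maps each node to the set of nodes within its transmission range,
    excluding the node itself (the nbd' sets).
    """
    nbrs = adj[s]
    d = len(nbrs)
    # d0: fewest neighbors that any neighbor x of s shares with s.
    d0 = min(len(adj[x] & nbrs) for x in nbrs)
    return d, d0

# Hypothetical topology: s hears a, b, c; b sits between a and c.
adj = {
    "s": {"a", "b", "c"},
    "a": {"s", "b"},
    "b": {"s", "a", "c"},
    "c": {"s", "b"},
}
d, d0 = local_domain_parameters(adj, "s")
```

Here node `a` shares only `b` with `s`, so `d0` is limited by the most weakly connected neighbor even though `b` shares two neighbors with `s`.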

10.4.1  Fault  Model 

We assume the locally bounded fault model of Chapter 8, wherein an adversary may place
faults so long as the number of faults in any single neighborhood does not exceed a specified
number $b$. Faulty nodes can exhibit Byzantine behavior at higher layers, i.e., they may
change the values/semantics of messages. However, all PHY/MAC layers are non-faulty,
and faulty nodes do not deliberately cause collisions or spoof MAC addresses.
10.4.2 Communication Model


We allow for an unreliable wireless channel where fading and other effects may lead to
non-ideal transmission characteristics. Accidental collisions and interference are possible, due
to an imperfect medium access mechanism. If a node transmits a message, the probability
that a neighbor successfully receives it is $p_s$. Packet errors due to fading, accidental
interference, etc. are subsumed in the error probability $(1 - p_s)$. The probability of successful
reception $p_s$ is assumed independent and identical for each transmission and each receiving
node. A desired access probability $0 < p_a < 1$, and an accordingly large enough timeout $T_a$,
are chosen such that if a packet was put into a node's outgoing queue at time $t$, then by time
$t + T_a$, the packet gets a chance to be transmitted by this node and received by neighbors
with probability at least $p_a$ (assumed to be independent of other nodes for simplicity).
Both $p_s$ and $p_a$ are assumed independent of $d$, $d_0$. Note that $T_a$ is a function of the target
access probability $p_a$, as well as the lengths of packet-queues (and hence traffic-levels in the
network).
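
Under this independence assumption, one can estimate how many copies a non-faulty sender would need to transmit so that all $d$ neighbors receive at least one copy with high probability. The sketch below is a back-of-the-envelope calculation rather than part of the chapter's algorithm; the function name and the target failure probability `eps` are assumptions introduced for the example.

```python
def min_retransmissions(p_s, d, eps):
    """Smallest k with (1 - (1 - p_s)**k)**d >= 1 - eps.

    That is, the fewest independent transmissions such that all d neighbors
    receive at least one copy with probability at least 1 - eps.
    Requires 0 < p_s <= 1 and 0 < eps < 1.
    """
    k = 1
    while (1 - (1 - p_s) ** k) ** d < 1 - eps:
        k += 1
    return k

k = min_retransmissions(p_s=0.8, d=20, eps=0.01)
```

Because the per-neighbor miss probability $(1 - p_s)^k$ shrinks geometrically in $k$, the required number of copies grows only logarithmically with $d$ and with $1/\epsilon$.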

All nodes possess a single half-duplex transceiver with an omnidirectional antenna, and
operate on a single channel. They also use a single transmission rate and all valid messages
are of a predetermined (and equal) size (as discussed later, this can be chosen to facilitate
reliable local broadcast). Note that the use of a common transmission rate $r$ bits/sec and a
common message size $l$ bits ensures that all messages occupy a certain minimum time $T_l \ge \frac{l}{r}$
on the channel. This extends to messages sent by faulty nodes, because non-faulty nodes can
choose to ignore messages that do not conform to the rate/size specification (information
about the transmission-rate of the message can be obtained from the recipient's physical
layer), giving faulty nodes no incentive to deviate from this established behavior.

The maximum and minimum propagation delays are $d_{prop}^{\max}$ and $d_{prop}^{\min}$ respectively
(note that $d_{prop}^{\min} \ge 0$). Any additional delays in physical layer timestamping are
upper-bounded by $t_{delay}$, yielding a maximum delay bound of $T_d = d_{prop}^{\max} + t_{delay}$. Thus
$T_u = T_l + T_d$.

For the rest of our discussion, we assume that nodes are externally synchronized within
bound $D$. Under this assumption, we may leverage Observation 1.

^Even in a multi-rate wireless network, it is possible to stipulate as part of the protocol specification that
all nodes use a specific transmission rate (say the lowest available) for critical message types that require
reliable dissemination.
We seek to ensure that the conditions of Observation 1 from Section 10.3 are satisfied.
Thus, we want $2T_l - T_u = T_l - T_d > 2D$, or $T_l > 2D + T_d$. Since $T_d$ is independent of $T_l$, this
is always achievable^ (albeit at the expense of inefficient bandwidth usage) by padding all
messages with extra bits to achieve the desired packet size $l$ (and hence $T_l$) for the specified
transmission rate $r$. Thus the Receipt-Order Condition can be made to hold.
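
As a rough illustration of the padding requirement, the sketch below computes the smallest message length in bits that meets the condition of Observation 1, $2T_l - T_u > 2D$, when $T_u = T_l + T_d$ and the channel occupancy is taken to be exactly $T_l = l/r$. The function name and the sample rate, skew and delay values are assumptions chosen for the example.

```python
import math

def min_message_bits(rate_bps, D, T_d):
    """Smallest integer length l (bits) with l / rate_bps > 2*D + T_d.

    Assumes T_l = l / rate_bps and T_u = T_l + T_d, so that the condition
    2*T_l - T_u > 2*D of Observation 1 reduces to T_l > 2*D + T_d.
    """
    return math.floor(rate_bps * (2 * D + T_d)) + 1

# Hypothetical setting: 1 Mbit/s, 1 ms synchronization bound, 250.4 us delay bound.
bits = min_message_bits(rate_bps=1_000_000, D=0.001, T_d=0.0002504)
```

The required length scales linearly with both the rate and the synchronization bound, which is the bandwidth inefficiency noted above: tighter synchronization or a lower rate for critical messages reduces the padding needed.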

We  now  provide  a  brief  description  of  the  message  representation. 

Distinct messages sent by a particular source (originator) are distinguished via
identifiers, which we shall denote as $id$. The $id$ is a
number in some range $[0, MAX]$, where $MAX$ is a suitably large number. Individual nodes
choose the sequence of $id$s for their messages in some privately determined pseudo-random
manner (such that $id$s are re-used only after large intervals of time; thus identifiers may be
considered unique for all practical purposes). This ensures that other nodes have no easy
way of anticipating what the sequence of $id$s for a given source node will be.

If a node sends two conflicting versions of the same message, it implies that they both
have the same id, but different values. Original messages are represented as m(src, (id, value)).
Of these, the src field is obtained from the MAC header, and thus contains the true MAC
address of the node that put the packet on air, since by assumption MAC addresses are
not subject to spoofing. The (id, value) part is message-content. If a message m is relayed
(repeated) by a neighbor, it is represented as REPEAT(relay.src, (m, timestamp)). Once
again, relay.src is the MAC address of the relay node, obtained from the MAC header.
The (m, timestamp) part is message-content (m denotes the (src, (id, value)) information
for the message; however, as this is now part of message content, a faulty relay node can
modify the src information if it so chooses, though it cannot affect the correctness of the
relay.src field in the MAC header).
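
As a minimal sketch, the two message forms could be modeled as follows. The class and field names are illustrative assumptions, not part of the protocol specification; the comments record which fields come from the MAC header and which are message content.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Original:
    src: str        # from the MAC header; assumed not spoofable
    msg_id: int     # pseudo-random id in [0, MAX]
    value: int      # message value

@dataclass(frozen=True)
class Repeat:
    relay_src: str    # MAC address of the relay, from the MAC header
    inner: Original   # message content; a faulty relay may alter inner.src
    timestamp: float  # receive-timestamp reported by the relay

def conflicting(a: Original, b: Original) -> bool:
    """Two versions conflict if they share source and id but differ in value."""
    return a.src == b.src and a.msg_id == b.msg_id and a.value != b.value
```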

10.5  The  Algorithm 

The goal of the algorithm is to achieve the following agreement condition with probabilistic
guarantees:

Definition 6. (Agreement Condition) If a local broadcast source s sends a message, then all
its non-faulty neighbors should agree on a single value for this message. If s is non-faulty,
this agreed-upon value should be the one actually sent by s. If s is faulty and sends multiple
conflicting versions of the message, nodes should choose the first value that s sent.

^Even if there is some dependence between $T_l$ and $T_d$, it may still be possible to do so, e.g., if
$T_d < \alpha T_l + \beta$ where $0 < \alpha < 1$ and $\beta > 0$ are constants, then one can make the message
long enough so that $T_l > \frac{D + \beta}{1 - \alpha}$ and satisfy the condition.

For  the  sake  of  simplicity  and  w.l.o.g.,  we  assume  that  the  message  m  may  take  one 
of  two  values  0  or  1.  The  algorithm  can  be  easily  generalized  to  more  than  two  message 
values. 

Suppose  we  have  sender  s.  Each  other  node  u  follows  the  following  algorithm: 

•  On receipt of a message m(s, (i, p)) from s directly with (local) receive-timestamp t:
If no other earlier version of this message (i.e., of the form m(s, (i, q))) was received
directly from s, make note of p as a candidate message value, and re-broadcast a copy
of m as REPEAT(u, (m(s, (i, p)), t)). If an earlier version of the same message was
received directly from s, discard this message.

•  On receipt of a message REPEAT(v, (m(s, (i, p)), $t_v$)):

If no previous REPEAT(v, (m(s, (i, *)), *)) ^ has been received, make note of p as a
candidate for message-id i from s, reported by v with timestamp $t_v$. Keep track of all
such copies of m received via REPEAT messages from different repeaters along with
their reported timestamps.

If this was the first message having the form REPEAT(*, (m(s, (i, *)), *)) received by
the node, start a timer (tagged by (s, i)) to expire after a duration $T + T_u$ (where
$T = T_a + T_r$, $T_a$ being the pre-defined access timeout, and $T_r$ being an estimated
upper bound on processing time from receiving a message m to the time of generating a
REPEAT and enqueueing it in the outgoing packet queue).

•  On expiration of the timer for (s, i):

Perform the following filtration and majority-determination procedure on the received
REPEAT messages containing repeated messages of the form m(s, (i, *)):

Timestamp-based filtration and majority determination: Let us refer to the value with
the highest repeated copy count as $c_1$, and the other one as $c_2$. If the number of copies of
$c_2$ is less than or equal to b, choose $c_1$ as the correct value. If the number of copies of
$c_2$ is greater than b: discard any messages with value $c_1$ whose timestamp t is greater
than the timestamps of more than b copies of $c_2$. Commit to the majority value from
amongst the remaining copies of $c_1$ and $c_2$.

^* is a placeholder for any value.
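
The filtration and majority-determination step can be sketched as below, for the binary-valued case. This is an illustrative sketch rather than a reference implementation: the function name and input format are hypothetical, and the text does not specify tie-breaking, so ties in the copy counts (before or after filtration) are resolved here in favor of $c_1$ as an explicit assumption.

```python
from collections import Counter

def decide(repeats, b):
    """repeats: list of (value, timestamp) pairs, one per distinct repeater.

    Returns the committed value, or None if no REPEAT was received.
    """
    counts = Counter(value for value, _ in repeats)
    if not counts:
        return None
    # c1 = value with the highest copy count; ties go to the smaller value (assumption).
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    c1 = ranked[0]
    if len(ranked) == 1 or counts[ranked[1]] <= b:
        return c1
    c2 = ranked[1]
    c2_times = [t for v, t in repeats if v == c2]
    # Keep a copy of c1 only if its timestamp exceeds at most b timestamps of c2 copies.
    kept_c1 = [t for v, t in repeats
               if v == c1 and sum(1 for u in c2_times if u < t) <= b]
    # Majority among the remaining copies; a tie favors c1 (assumption).
    return c1 if len(kept_c1) >= len(c2_times) else c2

# Faulty sender: value 0 sent first, value 1 sent later; one faulty relay
# reports value 1 with a falsely early timestamp (b = 1).
late_majority = [(0, 1.0), (0, 1.1), (1, 5.0), (1, 5.1), (1, 5.2), (1, 0.5)]
chosen = decide(late_majority, b=1)
```

In this example the three honest copies of value 1 carry timestamps later than both correct copies of value 0, so they are filtered out and value 0 is committed.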

Theorem 30. Consider a local broadcast domain in the wireless network comprising nbd(s)
for some node s. Assume that the physical clocks of all non-faulty nodes satisfy the Receipt-
Order Condition. Let $\alpha$ be a constant satisfying $\alpha < p_a p_s^2 - \epsilon$, where $\epsilon > 0$ is a
constant. If at most b nodes in any single neighborhood are faulty (where $b \le \frac{\alpha}{1+\alpha} d_0$),
then the above algorithm ensures that all non-faulty neighbors of s shall be able to achieve
the previously described agreement condition for s’s message with error probability at most

\[ d \exp\left( - \frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)} \right), \]

which is small if $d_0$ is large, and $d_0 \gg \ln d$.

Proof.  There  are  two  cases:  s  is  non-faulty  or  s  is  faulty: 

1.  s is non-faulty: s transmits exactly one version of the message (call it $m_1 = m(s, (i, q_{m_1}))$).
Since any $u \in nbd'(s)$ has at most b faulty nodes in nbd(u), it may receive up to a
maximum of b spurious repeats of s’s message. If the number of REPEAT copies of
the message received from non-faulty nodes (and thus containing the correct value)
is greater than b, this suffices to distinguish the legitimate value from a spurious one.

2.  s  is  faulty:  If  s  is  faulty,  it  may  leverage  the  unreliability  of  the  channel,  and  attempt 
to  create  confusion  by  sending  more  than  one  version  of  the  message,  each  containing 
different  values.  We  show  that  despite  this,  under  the  assumed  conditions,  reliable 
broadcast  will  still  be  achieved. 

By  assumption,  the  physical  clocks  of  all  non-faulty  nodes  satisfy  the  Receipt-Order 
Condition.  Then,  in  the  algorithm  described  earlier,  copies  of  the  second  message  received 
from  non-faulty  neighbors  get  filtered  out  as  follows: 

Suppose the sender s sends the two message-versions $m_1 = m(s, (i, q_{m_1}))$ and
$m_2 = m(s, (i, q_{m_2}))$ at absolute times $t_1$ and $t_2$ respectively.

By the Receipt-Order Condition, any non-faulty node that receives the first message
observes a receive-timestamp that is less than the receive-timestamp for the second message
observed by those non-faulty nodes that receive the second message. All non-faulty nodes
attach the correct observed timestamp to any REPEAT messages they send, and non-faulty
nodes that receive the REPEAT messages record the timestamp along with the message
encapsulated in the REPEAT.

Recall that the first message-version sent out by s is $m_1$ and the second is $m_2$. Also,
the message-version with the highest pre-filtration count is referred to as $c_1$ and the other one
is referred to as $c_2$.

We show that if more than b REPEAT copies of $m_1$ were received from non-faulty nodes
(i.e., more than b correct copies of $m_1$ were received), the agreement condition is achieved.

Then  the  following  cases  may  arise: 

•  If $c_1 = m_1$, and at most b copies of $m_2$ were received:

$m_1$ will win the majority vote, and get chosen immediately.

•  If $c_1 = m_1$, i.e., $m_1$ has the highest pre-filtration count, and greater than b copies of
$m_2$ were received:

A non-faulty node will only send a REPEAT of $m_2$ if it receives the message $m_2$
directly from s, and it will affix a correct receive-timestamp to its REPEAT. Since the
Receipt-Order Condition holds, the timestamp reported in any such REPEAT copy of
$m_2$ will be greater than the timestamp reported in any of the correct REPEAT copies
of $m_1$. Thus, no more than b copies of $c_2 = m_2$ can bear a false earlier timestamp.
As a result, no copy of $m_1$ sent by a non-faulty node will get filtered out erroneously,
and $m_1$ will win the majority vote.

•  If $c_1 = m_2$, i.e., $m_2$ has the highest pre-filtration count:

Since greater than b copies of $m_1$ were received from non-faulty nodes, then from the
Receipt-Order Condition, any copy (REPEAT) of $m_2$ sent by a non-faulty node has
a reported timestamp greater than the reported timestamps on the greater-than-b
correct copies of $m_1$, and the timestamp filtration rule ensures that all copies of $m_2$
sent by non-faulty nodes get filtered out. This leaves only up to b copies of $m_2$ sent
by faulty nodes. Thus, when the correct REPEAT copies of $m_1$ are greater than b,
$m_1$ will win the majority vote.

Hence, the algorithm definitely makes the correct decision if more than b copies of $m_1$
were received from non-faulty nodes. This is the same as the sufficient condition we earlier
stated for correct decision with a non-faulty source.

When b or fewer copies of $m_1$ are received from non-faulty nodes, the decision may be
correct or wrong, depending on how many copies of $m_2$ were received.

To bound the error probability, we assume the worst, i.e., it is always wrong if b or fewer
copies of $m_1$ are received from non-faulty nodes.

We represent the number of copies of $m_1$ repeated by non-faulty nodes that were received
by a node u as a random variable Z. Then, the requirement is that Z > b for both the cases
(recall that in the first case, the source is non-faulty, and so it sends only one message-version
$m_1$, but up to b spurious REPEAT messages containing wrong values may still be received from
faulty nodes).

Let the number of non-faulty mutual neighbors of s and u be g. Then $g \ge d_0 - b$. Z is
the sum of g i.i.d. Bernoulli($p_a p_s^2$) random variables, since a repeated copy of $m_1$ is received
from a non-faulty neighbor if that neighbor received $m_1$ directly from s (probability $p_s$), it
was able to transmit the REPEAT packet before timeout (probability $p_a$), and the REPEAT
was successfully received by u (probability $p_s$). This allows us to apply the following special
form of the Chernoff bound [83]:

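\[ \Pr[Z \le (1 - \beta)E[Z]] \le \exp\left(-\frac{\beta^2 E[Z]}{2}\right), \quad 0 < \beta < 1 \tag{10.1} \]

Knowing that $b \le \frac{\alpha}{1+\alpha} d_0 \le \alpha g$, we can set $\beta = 1 - \frac{\alpha}{p_a p_s^2}$ to obtain $b \le (1 - \beta)E[Z]$. Thus
application of the Chernoff bound^ yields:

\begin{align*}
\Pr[Z \le b] &\le \Pr[Z \le (1 - \beta)E[Z]] \\
&\le \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, g}{2}\right) \\
&\le \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)}\right) \tag{10.2}
\end{align*}

The last step uses $g \ge d_0 - b \ge \frac{d_0}{1+\alpha}$. Applying the union bound over all d neighbors of
sender s, the probability that any node makes an error is at most
$d \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)}\right)$,
which is small for large $d_0$, and $d_0 \gg \ln d$.  □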
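
To get a feel for how the bound of Theorem 30 behaves, it can be evaluated numerically. The sketch below simply computes the closed-form expression; the function name and the sample parameter values are hypothetical and are not taken from any experiment in this dissertation.

```python
import math

def local_error_bound(d, d0, alpha, p_a, p_s):
    """Upper bound d * exp(-(1 - alpha/q)^2 * q * d0 / (2 * (1 + alpha))), q = p_a * p_s^2."""
    q = p_a * p_s ** 2  # probability that one non-faulty repeat of m_1 reaches u
    if not 0 < alpha < q:
        raise ValueError("need 0 < alpha < p_a * p_s**2 so that beta > 0")
    beta = 1 - alpha / q
    exponent = beta ** 2 * q * d0 / (2 * (1 + alpha))
    return d * math.exp(-exponent)

# Illustrative values only: doubling d0 shrinks the bound sharply.
loose = local_error_bound(d=20, d0=50, alpha=0.1, p_a=0.9, p_s=0.9)
tight = local_error_bound(d=20, d0=100, alpha=0.1, p_a=0.9, p_s=0.9)
```

Because the bound is conservative, a value above 1 (possible for small d0) only means that the bound is uninformative for those parameters, not that the algorithm fails.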


Note that, as d increases, the timeout component $T_a$ would typically also need to increase
to maintain a sufficiently high value of $p_a$ (due to increased contention for the shared
channel). However, in most cases of practical interest, d will not be unduly large, and a
moderate value for T can suffice. Besides, the protocol is still fairly scalable, as it only
requires one message to be sent by each node.

^Since we need $\beta > 0$ for application of the Chernoff bound, this yields the constraint that $\alpha < p_a p_s^2 - \epsilon$
with $\epsilon > 0$. Thus $\alpha$ (which gives a measure of the proportion of tolerable faults) can be large when the
probability of successful receipt ($p_a p_s^2$) is large, and can only be small when $p_a p_s^2$ is small. Also note that
these constants determine how much larger $d_0$ should be compared to d to achieve small error probability.

In our analysis, we have assumed that whenever the number of copies of $m_1$ received
from non-faulty nodes is at most b, a wrong decision is made. In actuality, even in that
case there may still be situations where a correct decision is made (it is possible that the
total number of received copies of $m_2$ (from faulty or non-faulty nodes) may be much less
than b, since these transmissions are also subject to errors in reception). Thus, the presented
analysis establishes a conservative upper bound on the error probability.

10.6  Possible  Optimizations 

From  a  practical  perspective,  one  can  consider  many  possible  enhancements/optimizations 
to  the  basic  algorithm. 

1.  Each  node  can  be  made  to  retransmit  its  REPEAT  messages  k  times.  This  can  help 
improve  loss-resilience,  without  causing  duplication  problems,  since,  in  the  absence  of 
address  spoofing  (which  is  one  of  our  assumptions),  two  receipts  of  the  same  message 
are  easily  identified  by  the  repeater’s  address,  and  extra  copies  discarded. 

2.  One  could  consider  triggering  the  reliable  local  broadcast  algorithm  only  if  at  least 
one  warning  message  is  heard  from  a  node  claiming  to  have  heard  two  inconsistent 
messages  sent  by  s  (this  would  work  only  if  it  is  very  likely  that  a  fair  number  of 
nodes will receive both variants of s’s message). Also, while faulty nodes can raise
false alarms, that is no worse than proactively using the algorithm each time.
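
The duplicate handling mentioned in the first optimization can be sketched as follows, assuming each received REPEAT is available as a tuple whose first field is the repeater's (unspoofable) MAC address. The function name and tuple layout are hypothetical.

```python
def first_copy_per_repeater(received):
    """Keep only the first REPEAT per (repeater, source, message-id).

    received: iterable of (relay_src, src, msg_id, value, timestamp) in arrival order.
    """
    seen = set()
    kept = []
    for relay_src, src, msg_id, value, timestamp in received:
        key = (relay_src, src, msg_id)
        if key in seen:
            continue  # a retransmission (or a later conflicting copy) from the same repeater
        seen.add(key)
        kept.append((relay_src, src, msg_id, value, timestamp))
    return kept
```

With k retransmissions per node, each repeater still contributes at most one copy to the vote, so the counts used in the filtration step are unaffected.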

10.7  Discussion  on  Synchronization  Requirements 

The  synchronization  assumptions  required  to  ensure  the  Receipt-Order  Condition  holds 
may  actually  be  practically  feasible  in  many  settings. 

One can envision future scenarios where wireless nodes may be equipped with on-chip
atomic clocks [56] with very low drift. Thus, if the clocks are synchronized with an external
time source at time of deployment, then one might bound the total skew over the entire
operational lifetime of the network, and this would not be overly large. Alternatively, nodes
might be GPS-equipped, thus providing an out-of-band means of external synchronization.
In such scenarios, the conditions for Observation 1 can be made to hold.

In the absence of on-chip atomic clocks or GPS-equipped devices, it may not be possible
to ensure that all nodes in the network be synchronized to an external clock within some
constant bound D. However, it may still be quite feasible to ensure that each node is
internally synchronized within constant bound D with its two-hop neighbors. One could
envisage a situation where nodes are initially synchronized at time of deployment, and
thereafter periodically run a re-synchronization protocol, to ensure that any two nodes
within two hops of each other always stay internally synchronized within the bound D.

A  lightweight  Byzantine  time  synchronization  protocol  might  possibly  suffice  for  this. 
In  the  period  between  two  consecutive  re-synchronizations,  the  conditions  of  Observation  2 
can  thus  be  made  to  hold  for  every  local  broadcast  domain  in  the  network. 

10.8  Using  the  Primitive  for  Multi-Hop  Broadcast 

We briefly discuss how the proposed primitive could potentially be used as a building block
in a protocol to achieve broadcast in a multi-hop setting. As was mentioned earlier, the
algorithm we described in Section 8.4 was used as a subroutine in the
bounded-collision-resilient algorithm of [58]. It was observed in [58] that this algorithm
requires neighbors of the original sender to agree on the value it sent, even if the original
sender is faulty; for other nodes in the network, correctness of the algorithm only requires
that neighbors of non-faulty nodes agree on the messages they (the non-faulty nodes) send,
and this property was exploited. It follows that, if one is using a global broadcast protocol
with similar properties, one could consider using the reliable local broadcast primitive in the
neighborhood of the original sender, and merely stipulate that other nodes retransmit their
messages a sufficient number of times.

Otherwise, if the protocol requires that neighbors of all nodes agree on what they
sent, one could potentially proceed as follows: Let us consider a multi-hop network of
n nodes, where the minimum node degree is $d_{min}$, maximum node degree is $d_{max}$, and
$d_0 = \min_{x} \min_{y \in nbd'(x)} |nbd'(x) \cap nbd'(y)|$. Thus $d_0$ is the minimum number of common
neighbors shared by any two neighbors. The number of faulty nodes in any single neighborhood
is at most $b \le \frac{\alpha}{1+\alpha} d_0$ (where $\alpha < p_a p_s^2 - \epsilon$, $\epsilon > 0$). Through exchange of periodic
HELLO messages, nodes maintain a list of neighbors. Neighbors are added/removed only if more
than a certain number of HELLO messages have been consecutively received/lost. This helps
maintain a degree of stability in the neighborhood information, in the face of short-term
signal fluctuations. Suppose we have a global multi-hop broadcast protocol that assumes
reliable local broadcast, and requires a total of $O(n^m)$ messages to be sent (m is a constant),
i.e., has message complexity polynomial in n. Then, for each step of the protocol that requires
a node to perform a local broadcast, the reliable local broadcast primitive protocol is run in
the local broadcast domain comprising the node and its neighbors. Following the proof argument
of Theorem 30, we can obtain that the probability local broadcast is achieved reliably is at
least

\[ 1 - d_{max} \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)}\right) = 1 - \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)} + \ln d_{max}\right). \]

Since $n^m$ such successful local broadcasts are needed, if $d_0 = c_1 m \log n$ for a suitably chosen
constant $c_1 > \frac{2(1+\alpha)}{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2}$, and $d_{max} \le c_2 \log n$ for another suitably chosen
constant $c_2$ (note that $c_2 \ge c_1 m$ by definition), then by applying the union bound, one may see
that the global broadcast will also succeed with probability at least

\[ 1 - n^m \exp\left(-\frac{\left(1 - \frac{\alpha}{p_a p_s^2}\right)^2 p_a p_s^2 \, d_0}{2(1+\alpha)} + \ln d_{max}\right), \]

which approaches 1 for large n.

The  tolerable  number  of  per-neighborhood  faults  would  be  given  by  the  minimum  of 
the  tolerance  threshold  for  the  global  protocol,  and  the  local  broadcast  primitive. 

10.9  Discussion 

The algorithm we have outlined in this chapter is primarily an exploratory proof-of-concept
approach. We have sought to highlight the potential for leveraging the shared nature
of the medium in conjunction with knowledge of physical layer characteristics (in this case,
the transmission rate) and other information from lower layers (in this case, timestamps)
to achieve useful message-ordering conditions. Such conditions can facilitate the design of
scalable probabilistic solutions to the reliable local broadcast problem, and possibly other
reliable communication primitives. However, there are still numerous outstanding issues that
need to be addressed.

One issue is that of using a suitable Byzantine time synchronization protocol to ensure
internal synchronization between neighboring nodes (see Section 10.7). It might be possible
to leverage existing work in this area, e.g., [107]. Another issue is that one might wish
to eliminate the requirement in Observation 2 that during the interval in which the local
broadcast is occurring, nodes do not adjust their clocks. This would require a synchronization
algorithm that can run simultaneously with the local broadcast algorithm without
affecting the Receipt-Order Condition. Additionally, the described algorithm assumes i.i.d.
loss probabilities. If channel losses exhibit spatial correlation, the algorithm may need to
be modified to handle such situations.

A  major  shortcoming  of  the  algorithm  is  the  need  to  estimate  the  timeout  T  based 
on  access  probability  pa,  average  length  of  outgoing  packet-queues,  and  processing  time 
to  generate  a  REPEAT.  It  would  be  preferable  to  have  an  algorithm  where  nodes  decide 
to  invoke  the  filtration  and  majority  determination  procedure  based  on  some  event,  e.g., 
receipt  of  certain  messages. 

Many  of  the  assumptions  in  this  chapter  are  justified  by  assuming  a  network  with  a 
single  channel  and  omnidirectional  antennas.  Also  relevant  are  alternative  scenarios  where 
multiple  channels  or  beam-forming  antennas  are  available.  We  remark  that  usage  of  multiple 
channels  or  directional  antennas  tends  to  alter  the  broadcast  nature  of  the  wireless  medium, 
and  makes  the  network  look  increasingly  like  a  point-to-point  network.  Thus,  algorithms 
based  on  the  point-to-point  abstraction  may  increasingly  seem  suitable  in  such  scenarios. 

Furthermore, as mentioned in Section 7.3, the issue of handling a bounded number of
collisions in a grid network when the channel is reliable was addressed in [58]. It is relevant
to consider the possibility of combining ideas from [58] with some of the ideas discussed
in this chapter, to handle both an unreliable channel and a bounded number of collisions.
Other possibilities include trying to exploit the availability of multiple channels (as in [27]),
or other forms of physical layer diversity.

Chapter 11

Conclusion 


In this dissertation we have investigated the performance of wireless networks that are
subject to miscellaneous forms of functional constraints or malfunction. As wireless networks
proliferate and find use in diverse scenarios, they will increasingly need to operate in
the presence of heterogeneous (and often constrained) hardware capabilities. Furthermore,
fault-tolerant communication algorithms will be required to provide the building blocks for
reliable operation in the face of failure and/or disruption. The research performed as part
of this dissertation has contributed to developing an understanding of some of the issues
that would arise in such scenarios.

We have examined the routing and scheduling implications of having heterogeneous radios
with constrained switching ability, and channels with heterogeneous characteristics,
through theoretical investigation. The asymptotic capacity results in Chapter 3 and Chapter 4
quantify the impact of channel switching constraints, and also provide intuition about
the implications of such switching constraints for load-balanced routing and scheduling.
The results in Chapter 5 provide insight regarding suitable packet scheduling strategies for
networks where channels can have diverse rate characteristics.

The channel and interface management protocol described in Chapter 6 provides a
proof-of-concept that a generalized conceptual design approach toward handling various
kinds of physical layer heterogeneity can be developed.

The broadcast results in Chapter 8 and Chapter 9 establish fundamental limits on
fault-tolerance and also provide insight into the potential for exploiting the broadcast nature
of the wireless medium for reliable communication.

Some of the theoretical results that are part of this dissertation have also served as
building blocks for other work. The asymptotic capacity results for random (c, f) assignment
that were described in Chapter 4 have been used to obtain asymptotic capacity results
with random key pre-distribution in [11]. The algorithm for broadcast with locally-bounded
faults is used in [58] as a subroutine in a broadcast algorithm that is resilient to an adversary
that can cause a bounded number of collisions.

We have also identified and discussed many interesting directions for future work building
upon this research, both in terms of theory and protocol design.

Appendix A

Notation  and  Terminology 


Throughout  the  text  of  this  dissertation,  we  have  used  the  following  standard  asymptotic 
notation: 

•  $f(n) = O(g(n))$ means that $\exists\, c > 0, N_0 > 0$, such that $f(n) \le c\,g(n)$ for all $n \ge N_0$

•  $f(n) = o(g(n))$ means that $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$

•  $f(n) = \omega(g(n))$ means that $g(n) = o(f(n))$

•  $f(n) = \Omega(g(n))$ means that $g(n) = O(f(n))$

•  $f(n) = \Theta(g(n))$ means that $\exists\, c_1 > 0, c_2 > 0, N_0 > 0$, such that $c_1 g(n) \le f(n) \le c_2 g(n)$
for all $n \ge N_0$

When $f(n) = O(g(n))$, any function $h(n) = O(f(n))$ is also $O(g(n))$. We often refer to
such a situation as $h(n) = O(f(n)) \implies O(g(n))$.

Whenever we use the notation “log” without explicitly specifying the base, we imply
the natural logarithm. We also use the notation “ln” for the natural logarithm in many
proofs. We explicitly specify the base whenever it is other than e (the base of the natural
logarithm).

When we use the term w.h.p. (with high probability), we imply with probability that
tends to 1 as n tends to $\infty$ (where n is as defined in the specific context).

Appendix B

Proofs  of  Connectivity  Results 


The necessary conditions for connectivity with adjacent (c, f) assignment and random (c, f)
assignment are both obtained by an adaptation of the proof techniques used in [42] to obtain
the necessary condition for connectivity. The major difference stems from the fact that in
the presence of switching constraints, two nodes may be within range and yet be unable to
communicate with each other (if they cannot switch (operate) on any common channel).
The following lemma, which was stated and proved in [42], will be used in our proofs.

Lemma 48. (i) For any $p \in [0, 1]$,

\[ (1 - p) \le e^{-p}. \]

(ii) For any given $\theta \ge 1$, there exists $p_0 \in [0, 1]$, such that

\[ e^{-\theta p} \le (1 - p), \quad \forall\, 0 \le p \le p_0. \]

If $\theta > 1$, then $p_0 > 0$.


Proof.  See  Lemma  2.1  in  [42].  □ 

Lemma 49. If $\pi r^2(n) = \frac{\log n + b}{pn}$, then, for any fixed $\theta < 1$:

\[ n\left(1 - p\pi r^2(n)\right)^{n-1} \ge \theta e^{-b} \tag{B.1} \]

for sufficiently large n.

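Proof. This is essentially the proof of Lemma 2.2 from [42], with the minor change that
$\pi r^2(n)$ is replaced with $p\pi r^2(n)$. Taking the log of the L.H.S. and using
the Taylor series expansion, we have:

\begin{align*}
\log \text{L.H.S.} &= \log n + (n - 1) \log\left(1 - p\pi r^2(n)\right) \\
&= \log n - (n - 1) \sum_{i=1}^{\infty} \frac{\left(p\pi r^2(n)\right)^i}{i} \\
&= \log n - (n - 1) \left( \sum_{i=1}^{2} \frac{\left(p\pi r^2(n)\right)^i}{i} + \mathcal{E}(n) \right)
\end{align*}

where

\begin{align*}
\mathcal{E}(n) = \sum_{i=3}^{\infty} \frac{\left(p\pi r^2(n)\right)^i}{i}
&= \sum_{i=3}^{\infty} \frac{1}{i} \left(\frac{\log n + b}{n}\right)^i \\
&\le \frac{1}{3} \int_{2}^{\infty} \left(\frac{\log n + b}{n}\right)^x dx \\
&\le \frac{1}{3} \left(\frac{\log n + b}{n}\right)^2 \quad \text{for large } n.
\end{align*}

From the above, we obtain:

\begin{align*}
\log \text{L.H.S.} &\ge \log n - (n - 1) \left( \frac{\log n + b}{n} + \frac{5(\log n + b)^2}{6n^2} \right) \\
&\ge -b - \frac{(\log n + b)^2 - (\log n + b)}{n} \\
&\ge -b - \delta \quad \text{for large } n.
\end{align*}

Setting $\delta = \ln\frac{1}{\theta}$, and taking exponents on both sides yields that the L.H.S. $\ge \theta e^{-b}$ for large n.  □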
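
As a quick numerical sanity check of Lemma 49 (not a substitute for the proof), one can evaluate the left-hand side for a large n. Under the lemma's hypothesis, $p\pi r^2(n) = (\log n + b)/n$, so p drops out of the computation. The function name and the chosen values of n, b and $\theta$ are illustrative assumptions.

```python
import math

def lemma49_lhs(n, b):
    """Evaluate n * (1 - x)^(n - 1) with x = (log n + b) / n."""
    x = (math.log(n) + b) / n  # equals p * pi * r^2(n) under the lemma's hypothesis
    return n * (1 - x) ** (n - 1)

# Illustrative check with n = 10^6, b = 1, theta = 0.9.
lhs = lemma49_lhs(10**6, 1.0)
rhs = 0.9 * math.exp(-1.0)
holds = lhs >= rhs
```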


B.1  Adjacent (c, f) Assignment: Proof of Theorem 1

Given that a node has block location i, the probability that it can operate on a common
channel with another node (we shall often refer to this as sharing a channel) within its
range is given in (3.3), and denoted by $p_{adj}(i)$.

Note that $p_{adj}(i)$ is different for different block locations i primarily because nodes with
channel-blocks at the fringes of the band are less likely to share channels with other nodes.
Since we are deriving a necessary condition for connectivity, it is possible to make the
following assumption for the purpose of this proof:

Channel pairs $(i, c - f + i + 1)$, $1 \le i \le f - 1$, possess magical capabilities, such that
communication on channel i ends up being visible on channel $c - f + i + 1$, and vice-versa.
Thus, if a node has channel i, then it can also communicate with a node that does not
share any channel with it, but has channel $c - f + i + 1$. Another way to view this situation
is that although nodes are assigned channels as per the adjacent (c, f) model, at time of
network operation, a node having channel $c - f + i + 1$, $1 \le i \le f - 1$, uses channel i instead
(i.e., $c - f + i + 1$ serves as an alias for i).

Under this assumption, $p_{adj}(i) = \min\{\frac{2f-1}{c-f+1}, 1\}$, for all i. If the network is disconnected
under this assumption, then it must necessarily be so otherwise. This can be seen thus:
suppose we are given a network instance with nodes assigned adjacent channels as per the
adjacent (c, f) model, and we then impose the assumption stated above. Suppose this
network is disconnected. Now the imposed assumption is removed, but the channel block
assigned to each node remains unchanged. Then, in the new scenario, some nodes that
were earlier able to communicate will not be able to do so anymore; however, those nodes
that were incapable of communicating will preserve their status quo. Thus, a necessary
condition for the hypothetical network is also valid for the actual network.
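
The effect of the aliasing assumption can be checked by brute-force enumeration. The sketch below assumes that, in the adjacent (c, f) model, each node holds a block of f contiguous channels whose block location is uniform over 1, ..., c - f + 1; that reading of the model, and the function name, are assumptions made for illustration. With aliasing, block locations effectively wrap around, so every location shares a channel with $\min\{2f - 1, c - f + 1\}$ of the possible locations.

```python
from fractions import Fraction

def share_probability(c, f, i, aliased):
    """Probability that block location i shares a channel with a uniformly
    chosen block location in 1..c-f+1 (assumed adjacent (c, f) model)."""
    n_blocks = c - f + 1

    def channels(j):
        block = range(j, j + f)  # channels j .. j+f-1
        if aliased:
            # channel c-f+i+1 acts as an alias for channel i (1 <= i <= f-1)
            return {(k - 1) % n_blocks + 1 for k in block}
        return set(block)

    mine = channels(i)
    hits = sum(1 for j in range(1, n_blocks + 1) if mine & channels(j))
    return Fraction(hits, n_blocks)

# With c = 10, f = 3: fringe location 1 improves from 3/8 to 5/8 under aliasing.
fringe_plain = share_probability(10, 3, 1, aliased=False)
fringe_alias = share_probability(10, 3, 1, aliased=True)
```

Since aliasing can only add communication opportunities, the aliased probability is never smaller than the original one, which is why disconnection in the hypothetical network implies disconnection in the actual one.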

Therefore, to establish a necessary condition for connectivity with adjacent (c, f)
assignment, we establish a necessary condition for connectivity in a scenario where we have
the additional assumption described above. This proof is an adaptation of a similar proof
in [42] (Theorem 2.1 in [42]).

We focus on the disconnection event where singleton sets are partitioned from the rest of the network. Recall that $p = \min\{\frac{2f-1}{c-f+1}, 1\}$. When $f \ge \frac{c+2}{3}$, then $p = 1$, i.e., any pair of nodes that are within range can communicate with each other, and the necessary condition result from [42] applies directly. Hence, we consider only the scenario $f < \frac{c+2}{3}$, for which $p = \frac{2f-1}{c-f+1}$. Also note that:

$$\pi r^2(n) \le \frac{2\log n}{pn} \le \frac{2\alpha\log^2(n)}{n} \quad \text{where } \alpha \text{ is a constant} \qquad \text{(B.2)}$$

($\because p \ge \frac{f}{c-f+1} \ge \frac{f}{c}$ and $c \le \alpha\log n$ for some constant $\alpha$, and $b(n) \le \log n$ for large $n$ $(\because \limsup_{n\to\infty} b(n) < +\infty)$)

The probability that a node $x$ is isolated, i.e., cannot communicate with any other node, is given by $p_1 = (1 - p\pi r^2(n))^{n-1}$. Consider the event that nodes $x$ and $y$ are both isolated.

Figure B.1: Three Cases: Necessary Condition for Connectivity

There are three different cases for this (also see Fig. B.1):

1. $x$ and $y$ lie within distance $r(n)$ of each other, but do not share a common channel

2. $x$ and $y$ do not lie within distance $r(n)$ of each other, but have overlapping neighborhood regions, i.e., they lie within a distance $2r(n)$ of each other

3. The neighborhood regions of $x$ and $y$ are disjoint, i.e., the distance between $x$ and $y$ is greater than $2r(n)$.

The probability that both $x$ and $y$ are isolated is given by the probability that they cannot communicate with each other, and none of the remaining $n-2$ nodes can communicate with either of them.

From the geometry of the situation (Fig. B.2), it follows that if $x$ and $y$ are separated by a distance $d(n)$, then the overlap area between the neighborhoods of $x$ and $y$ = 2 [(area of sector subtending angle $2\theta$) $-$ (area of $\triangle ABC$)] = $2r^2(n)\theta - r^2(n)\sin(2\theta)$, where $\theta = \cos^{-1}\left(\frac{d(n)}{2r(n)}\right)$.
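
The overlap-area expression can be evaluated directly. The sketch below is illustrative only (the function name is hypothetical); it implements $2r^2\theta - r^2\sin(2\theta)$ with $\theta = \cos^{-1}(d/2r)$ and returns zero once the two discs are disjoint.

```python
import math


def overlap_area(r, d):
    """Area common to two discs of radius r whose centres are d apart."""
    if d >= 2 * r:
        return 0.0  # case 3: the neighborhoods are disjoint
    theta = math.acos(d / (2 * r))
    return 2 * r * r * theta - r * r * math.sin(2 * theta)


if __name__ == "__main__":
    r = 1.0
    for d in (0.0, 0.5, 1.0, 1.5, 2.0):
        frac = overlap_area(r, d) / (math.pi * r * r)
        print(f"d = {d:.1f} r: overlap is {frac:.3f} of a neighborhood")
```

At $d = r$ the overlap is roughly 39% of a neighborhood, so for separations between $r(n)$ and $2r(n)$ more than half of $y$'s neighborhood lies outside $x$'s.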

Let us first consider case 1, i.e., the distance between $x$ and $y$ is $d(n) \le r(n)$. We view it as two sub-cases (noting that $\frac{16\log\log n}{\log n} < 1$ for large $n$):

• (i) $y$ is at distance $d(n) \le r'(n) = \left(\frac{16\log\log n}{\log n}\right) r(n)$ of $x$

• (ii) $y$ is at distance $d(n) > r'(n) = \left(\frac{16\log\log n}{\log n}\right) r(n)$ of $x$

The probability that a node $z \ne x, y$ within range of both $x$ and $y$ is capable of communicating with at least one of $x$ and $y$, given that $x$, $y$ cannot communicate with each other, is $q \ge \min\{\frac{3f-1}{c-f+1}, 1\}$. Also, when $f < \frac{c+2}{4}$, then $3f - 1 < c - f + 1$, and $q \ge \frac{3f-1}{c-f+1} \ge \frac{3p}{2}$.

For sub-case (i) of case (1), the overlap area between the neighborhoods of $x$ and $y$ is at least $(1-\delta)\pi r^2(n)$ for any $\delta > 0$ and large enough $n$, since the separation $d(n) \le \left(\frac{16\log\log n}{\log n}\right) r(n)$. For our purpose, it suffices to take $\delta = \frac{1}{5}$, yielding an overlap area of at least $\frac{4\pi r^2(n)}{5}$. Then the probability that a node can communicate with either $x$ or $y$ or both is at least $q$ times the probability of lying in the overlap area.

Figure B.2: Overlap Area of Neighborhoods

Thus, the contribution of sub-case (i) of case (1) to the probability that both $x$ and $y$ are isolated can be upper-bounded as follows:

When $f < \frac{c+2}{4}$ (implying $q \ge \frac{3p}{2}$):


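$$\begin{aligned}
p_{21}(i) &\le \pi r'^2(n)(1-p)\left(1 - q\,\frac{4\pi r^2(n)}{5}\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - \frac{3p}{2}\cdot\frac{4\pi r^2(n)}{5}\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - \frac{6p\pi r^2(n)}{5}\right)^{n-2} \\
&\le \pi r^2(n)\, e^{-\frac{6(n-2)p\pi r^2(n)}{5}} \quad \text{from Lemma 48} \\
&\le \frac{2\alpha\log^2 n}{n}\, e^{-\frac{6(n-2)p\pi r^2(n)}{5}} \quad \text{from (B.2)} \\
&= e^{-\frac{6}{5}(\log n + b(n)) + \frac{12(\log n + b(n))}{5n} + \log 2\alpha + 2\log\log n - \log n} \\
&\le e^{-\frac{21}{10}\log n - \frac{6}{5}b(n)} \quad \text{for large } n \\
&\le e^{-2\log n - b(n) - \frac{1}{2}\log\log n}
\end{aligned} \qquad \text{(B.3)}$$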


Figure B.3: First Case: Necessary Condition for Connectivity

When $f \ge \frac{c+2}{4}$, $p = \min\{\frac{2f-1}{c-f+1}, 1\} \ge \frac{1}{2}$. For this situation, we merely consider the probability that one of the remaining $n-2$ nodes can communicate with one of $x$ and $y$ (say $x$) to obtain the upper bound on both $x$ and $y$ being isolated:

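$$\begin{aligned}
p_{21}(i) &\le \pi r'^2(n)(1-p)(1 - p\pi r^2(n))^{n-2} \\
&\le \left(\frac{256(\log\log n)^2}{\log^2 n}\right)\pi r^2(n)(1 - p\pi r^2(n))^{n-2} \\
&\le \left(\frac{256(\log\log n)^2}{\log^2 n}\right)\pi r^2(n)\, e^{-(n-2)p\pi r^2(n)} \quad \text{from Lemma 48} \\
&= \left(\frac{256(\log\log n)^2}{\log^2 n}\right)\left(\frac{\log n + b(n)}{pn}\right) e^{-\frac{(n-2)(\log n + b(n))}{n}} \\
&\le \left(\frac{256(\log\log n)^2\,(2(2\log n))}{n\log^2(n)}\right) e^{-\frac{(n-2)(\log n + b(n))}{n}} \quad \left(\because p \ge \tfrac{1}{2}\right) \\
&= e^{-\log n - b(n) + \frac{2(\log n + b(n))}{n} + \log 256 + \log 4 - \log n - \log\log n + 2\log\log\log n} \\
&\le e^{-2\log n - b(n) - \frac{1}{2}\log\log n}
\end{aligned} \qquad \text{(B.4)}$$

From (B.3) and (B.4), for all valid $f$:

$$p_{21}(i) \le e^{-2\log n - b(n) - \frac{1}{2}\log\log n} \quad \text{for large } n \qquad \text{(B.5)}$$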


For sub-case (ii), the situation is depicted in Fig. B.3. The probability that some node can communicate with at least one of $x$ or $y$ is lower bounded by the probability that it lies in range of $x$ (this probability is $\pi r^2(n)$) and shares a channel with it (this probability is $p$), or it lies out of range of $x$ but within range of $y$ (this probability is at least $\frac{\sqrt{3}\,r(n)r'(n)}{2}$ for large enough $n$)$^1$, and shares a channel with $y$ (this probability is $p$). The contribution to the probability that both $x$ and $y$ are isolated is thus at most:


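$$\begin{aligned}
p_{21}(ii) &\le (\pi r^2(n) - \pi r'^2(n))(1-p)\left(1 - p\left(\pi r^2(n) + \frac{\sqrt{3}\,r(n)r'(n)}{2}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\left(\pi r^2(n) + \frac{\sqrt{3}\,r(n)r'(n)}{2}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\pi r^2(n)\left(1 + \frac{\sqrt{3}\,r'(n)}{2\pi r(n)}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\pi r^2(n)\left(1 + \frac{8\sqrt{3}\log\log n}{\pi\log n}\right)\right)^{n-2} \\
&\le \pi r^2(n)\, e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right)} \quad \text{from Lemma 48 } (\because \pi < 2\sqrt{3}) \\
&\le \left(\frac{2\alpha\log^2 n}{n}\right) e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right)} \\
&= e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right) + \log 2\alpha + 2\log\log n - \log n} \\
&\le e^{-\log n - b(n) - 4\log\log n + \frac{2(\log n + b(n))\left(1 + \frac{4\log\log n}{\log n}\right)}{n} + \log 2\alpha + 2\log\log n - \log n} \\
&\le e^{-2\log n - b(n) - \log\log n} \quad \text{for large } n
\end{aligned} \qquad \text{(B.6)}$$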


For case 2, the probability that some node can communicate with either $x$ or $y$ can be lower bounded by the probability that it lies in range of $x$ (this probability is $\pi r^2(n)$) and shares a channel with it (this probability is $p$), or it lies out of range of $x$ but within range of $y$ (the part of $y$'s neighborhood that lies outside $x$'s neighborhood in Fig. B.2 has area at least $\frac{1}{2}\pi r^2(n)$ for this case), and shares a channel with it. Thus the contribution of this case to the probability that both $x$ and $y$ are isolated is upper bounded by:

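$$\begin{aligned}
p_{22} &\le (4\pi r^2(n) - \pi r^2(n))\left(1 - \tfrac{3}{2}p\pi r^2(n)\right)^{n-2} \\
&\le 3\pi r^2(n)\, e^{-\frac{3(n-2)p\pi r^2(n)}{2}} \quad \text{from Lemma 48} \\
&\le e^{-\frac{3}{2}\log n - \frac{3}{2}b(n) + \frac{3(\log n + b(n))}{n} + \log 6\alpha + 2\log\log n - \log n}
\end{aligned}$$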

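$^1$The area within range of $y$ but out of range of $x$ is given by $\pi r^2(n)$ $-$ overlap area, where overlap area = 2 (area of sector subtending angle $2\theta$ $-$ area of $\triangle ABC$) $\le \pi r^2(n) - r^2(n)\sin(2\theta)$. Note that $\frac{\pi}{3} \le \theta \le \frac{\pi}{2}$. Thus the non-overlap area $\ge r^2(n)\sin(2\theta) = r^2(n)(2\sin\theta\cos\theta) = r^2(n)(2\sin\theta)\left(\frac{d(n)}{2r(n)}\right) \ge 2r^2(n)\left(\sin\frac{\pi}{3}\right)\left(\frac{r'(n)}{2r(n)}\right) = \frac{\sqrt{3}\,r(n)r'(n)}{2}$.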
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<e  2 for  large  n  (B.7) 

For case 3, the probability that both $x$ and $y$ are isolated is upper bounded by:

$$\begin{aligned}
p_{23} &\le (1 - 4\pi r^2(n))(1 - p(2\pi r^2(n)))^{n-2} \\
&\le (1 - 2p\pi r^2(n))^{n-2} \\
&\le e^{-2(n-2)p\pi r^2(n)} \quad \text{from Lemma 48} \\
&\le e^{-2\log(n) - 2b(n) + \frac{4(\log n + b(n))}{n}}
\end{aligned} \qquad \text{(B.8)}$$

Then, the probability $p_2$ that nodes $x$ and $y$ are both isolated is bounded by:

$$p_2 \le p_{21}(i) + p_{21}(ii) + p_{22} + p_{23} \qquad \text{(B.9)}$$

Let us first consider the case where $b(n) = b$ is a constant.


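$$\begin{aligned}
\Pr[\text{disconnection}] &\ge \sum_{x} \Pr[x \text{ is the only isolated node}] \\
&\ge \sum_{x} \Pr[x \text{ isolated}] - \sum_{x,\, y \ne x} \Pr[x \text{ and } y \text{ both isolated}] \\
&= np_1 - n(n-1)p_2 \\
&\ge n(1 - p\pi r^2(n))^{n-1} - n(n-1)\left(p_{21}(i) + p_{21}(ii) + p_{22} + p_{23}\right)
\end{aligned}$$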

>  Oe-^  -  n{n  -  1) 

_|_g-21ogn-6-loglogn  _|_  g-|  logn-b  ^-2 log n-2fe+LL£Zl±tl ^ 

$$\ge \theta e^{-b} - (1+\epsilon)e^{-2b}$$

for any $\theta < 1$, $\epsilon > 0$, and large $n$ (Lemma 48, Lemma 49)

(B.10)

Now consider the case where $b(n)$ is not constant, and $\limsup_{n\to\infty} b(n) = b$. Then, for any $\epsilon > 0$, $b(n) - b \le \epsilon$ for large $n$. Since the probability of disconnection monotonically decreases in $b(n)$, we can take the following bound:

$$\Pr[\text{disconnection}] \ge \theta e^{-(b+\epsilon)} - (1+\epsilon)e^{-2(b+\epsilon)} \quad (\text{for large enough } n) \qquad \text{(B.11)}$$

Since (B.10) and (B.11) hold for any $\theta < 1$, $\epsilon > 0$ and large enough $n$, it follows that when $\limsup_{n\to\infty} b(n) < +\infty$, the network is asymptotically disconnected with some positive probability.


B.2  Random  (c, f)  Assignment:  Proof  of  Theorem  4


From the model definition, the probability that two nodes in range of each other can operate on a common channel (we will often refer to this as sharing a channel) is $p = p_{rnd}$, where $1 - p_{rnd} = \left(1 - \frac{f}{c}\right)\left(1 - \frac{f}{c-1}\right)\cdots\left(1 - \frac{f}{c-f+1}\right)$. Note that for $f > \frac{c}{2}$, $p = p_{rnd} = 1$, as any two nodes are guaranteed to have at least one common channel. Then the necessary condition for connectivity proved in [42] is applicable. Therefore, we will only consider the case $f \le \frac{c}{2}$.

The probability that a node $x$ is isolated, i.e., cannot communicate with any other node, is given by $p_1 = (1 - p\pi r^2(n))^{n-1}$.
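
As a numerical illustration, the sketch below computes $p_{rnd}$ both combinatorially and through the product form, and evaluates the isolation probability. It assumes, as the later steps of the proof suggest, that the range is chosen so that $\pi r^2(n) = \frac{\log n + b}{pn}$; that relation and all function names are assumptions made for the example.

```python
import math


def p_rnd(c, f):
    """Probability that two uniformly random f-subsets of c channels intersect."""
    return 1.0 - math.comb(c - f, f) / math.comb(c, f)


def p_rnd_product(c, f):
    """Same quantity via the product form; intended for f <= c/2."""
    prod = 1.0
    for i in range(f):
        prod *= 1.0 - f / (c - i)
    return 1.0 - prod


def isolation_probability(n, p, b):
    """p_1 = (1 - p*pi*r^2)^(n-1), assuming pi*r^2 = (log n + b)/(p*n)."""
    area = (math.log(n) + b) / (p * n)
    return (1.0 - p * area) ** (n - 1)


if __name__ == "__main__":
    p = p_rnd(20, 4)
    n, b = 10**6, 1.0
    # n * p_1 approaches e^{-b}: the expected number of isolated nodes
    print(p, n * isolation_probability(n, p, b), math.exp(-b))
```

The expected number of isolated nodes, $np_1$, stays close to $e^{-b}$ regardless of $p$, which is why a bounded $b(n)$ leaves isolated nodes with non-vanishing probability.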

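We begin by making the following observations:

$$p = p_{rnd} \ge \frac{f}{c} \qquad \text{(B.12)}$$

$$\pi r^2(n) \le \frac{2c\log n}{fn} \le \frac{2\alpha\log^2(n)}{n} \quad \text{for some constant } \alpha \qquad \text{(B.13)}$$

($\because c = O(\log n) \Rightarrow c \le \alpha\log n$ for some constant $\alpha$ and large enough $n$, and $b(n) \le \log n$ for large $n$ $\because \limsup_{n\to\infty} b(n) = b < +\infty$)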


Consider the event that two nodes $x$ and $y$ are both isolated. There are three different cases for this (Fig. B.1):

1. $x$ and $y$ lie within distance $r(n)$ of each other, but do not share a common channel

2. $x$ and $y$ do not lie within distance $r(n)$ of each other, but have overlapping neighborhood regions, i.e., lie within distance $2r(n)$ of each other

3. The neighborhood regions of $x$ and $y$ are disjoint, i.e., the distance between them is greater than $2r(n)$.

From the geometry of the situation (Fig. B.2), it follows that if $x$ and $y$ are separated by a distance $d(n)$, then the overlap area between the neighborhoods of $x$ and $y$ = 2 (area of sector subtending angle $2\theta$ $-$ area of $\triangle ABC$) = $2r^2(n)\theta - r^2(n)\sin(2\theta) \le \pi r^2(n) - r^2(n)\sin(2\theta)$, where $\theta = \cos^{-1}\left(\frac{d(n)}{2r(n)}\right)$.

Of these, for case (1), consider two sub-cases:

• (i) $y$ is at distance $d(n) \le r'(n) = \left(\frac{16\log\log n}{\log n}\right) r(n)$ from $x$

• (ii) $y$ is at distance $d(n) > r'(n) = \left(\frac{16\log\log n}{\log n}\right) r(n)$ from $x$

The probability that a node $z \ne x, y$ within range of both $x$ and $y$ is capable of communicating with at least one of $x$ and $y$, given that they do not have a common channel of operation, is given by $q = 1 - \left(1 - \frac{2f}{c}\right)\left(1 - \frac{2f}{c-1}\right)\cdots\left(1 - \frac{2f}{c-f+1}\right) \ge p$ (recall that we are only considering $f \le \frac{c}{2}$).

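When $f > \frac{c}{3}$, it is evident that $q = 1 \ge p$. When $f \le \frac{c}{3}$:

$$\begin{aligned}
\frac{1-p}{1-q} &= \frac{\left(1-\frac{f}{c}\right)\left(1-\frac{f}{c-1}\right)\cdots\left(1-\frac{f}{c-f+1}\right)}{\left(1-\frac{2f}{c}\right)\left(1-\frac{2f}{c-1}\right)\cdots\left(1-\frac{2f}{c-f+1}\right)} \\
&= \left(1 + \frac{\frac{f}{c}}{1-\frac{2f}{c}}\right)\left(1 + \frac{\frac{f}{c-1}}{1-\frac{2f}{c-1}}\right)\cdots\left(1 + \frac{\frac{f}{c-f+1}}{1-\frac{2f}{c-f+1}}\right) \\
&\ge 1 + \frac{\frac{f}{c}}{1-\frac{2f}{c}} + \frac{\frac{f}{c-1}}{1-\frac{2f}{c-1}} + \cdots + \frac{\frac{f}{c-f+1}}{1-\frac{2f}{c-f+1}} \\
&\ge 1 + \frac{f}{c} + \frac{f}{c-1} + \cdots + \frac{f}{c-f+1} \\
&\ge 1 + \frac{f^2}{c}
\end{aligned} \qquad \text{(B.14)}$$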
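
The ratio $\frac{1-p}{1-q}$ has a compact combinatorial form that makes the bound easy to check exactly. The sketch below is illustrative: it assumes uniformly random $f$-subsets, so that $1-p = \binom{c-f}{f}/\binom{c}{f}$ and $1-q = \binom{c-2f}{f}/\binom{c}{f}$, and it compares the ratio against $1 + \frac{f^2}{c}$; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def miss_ratio(c, f):
    """(1 - p)/(1 - q) for random (c, f) assignment; requires 3f <= c.

    1 - p: a third node's f channels avoid the f channels of one node.
    1 - q: they avoid the 2f channels held by two nodes with disjoint sets.
    The common factor C(c, f) cancels in the ratio.
    """
    if 3 * f > c:
        raise ValueError("q = 1 when 3f > c, so the ratio is undefined")
    return Fraction(comb(c - f, f), comb(c - 2 * f, f))


if __name__ == "__main__":
    c, f = 30, 5
    print(float(miss_ratio(c, f)), 1 + f * f / c)
```

Exact rational arithmetic avoids any floating-point doubt about the direction of the inequality for small parameters.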


Hence: 


^1  ^~P  I  n  1  ^~P 

q>l-  - - =P+A-P)- 


1  + 


=  p+{l-p)  [l- 


1  +  V 

=p|i+^^ 


i+v 


2/2 


■^ + 1 


from  Lemma  14  and  the  fact  that  p  =  Pmd 


(B.15) 

For sub-case (i) of case (1), the overlap area between the neighborhoods of $x$ and $y$ is at least $(1-\delta)\pi r^2(n)$ for any $\delta > 0$ and large enough $n$, since the separation $d(n) \le r'(n) = \left(\frac{16\log\log n}{\log n}\right) r(n)$. For our purpose, it suffices to take $\delta = \frac{1}{16}$, yielding an overlap area of at least $\frac{15\pi r^2(n)}{16}$. Then the probability that a node can communicate with either $x$ or $y$ or both is at least $q$ times the probability of lying in the overlap area.

When  I  <  ;  then  from  (B.15): 


q>p  1  + 


C 

2P 


(1-^) 


(1  +  ?) 


/' 


(B.16) 


^  f  f  ^  (loglogn) 

\  6  J  6  c  log  n 


and  large  n 


Resultantly,  the  contribution  of  subcase  (i)  of  case  (1)  to  the  probability  that  both  x  and 
y  are  isolated  can  be  upper-bounded  as: 


P2i{i)  <  vrr'  (n)(l  -p)(l  -  q 


15vrr2(n)  _2 


16 

<  7rr2(n)(l  - 

5 

<  7rr^(n)(l  —  -p7rr^(n)) 


n-2  ^  (loglogn)^ 


c  logn 


<  7rr^(n)e  from  Lemma  48 

^  j^2«log^(n)^ 


<  g- 1  log  ri-jb+  5(l°g"+i')  _iog  n-Hog  2a-r2  log  logn 

<  e~^  iog"^-4^>  fQ];.  large  n 

<  g-21ogn-fe(n)-|  logloglogn 


(B.17) 


For  sub-case  (i)  of  case  (1),  when  ^  lower  bound  the  probability  of  a 

node  being  able  to  communicate  with  either  of  x  and  y  by  the  probability  that  it  is  able  to 
communicate  with  one  of  them  (say  x).  Thus  the  probability  that  both  x  and  y  are  isolated 
is  at  most: 




For sub-case (ii), the situation is depicted in Fig. B.3. The probability that some node can talk to either $x$ or $y$ is lower bounded by the probability that it lies in range of $x$ (this probability is $\pi r^2(n)$) and shares a channel with it (the probability of sharing a channel is $p$), or it lies out of range of $x$ but within range of $y$ (at least $\frac{\sqrt{3}\,r(n)r'(n)}{2}$ for large enough $n$)$^1$, and shares a channel with $y$ (once again this probability is $p$).

$^1$The area within range of $y$ but out of range of $x$ is given by $\pi r^2(n)$ $-$ overlap area, where overlap area = 2 (area of sector subtending angle $2\theta$ $-$ area of $\triangle ABC$) $\le \pi r^2(n) - r^2(n)\sin(2\theta)$. Note that $\frac{\pi}{3} \le \theta \le \frac{\pi}{2}$ for sub-case (ii). Thus, the non-overlap area $\ge r^2(n)\sin(2\theta) = r^2(n)(2\sin\theta\cos\theta) = r^2(n)(2\sin\theta)\left(\frac{d(n)}{2r(n)}\right) \ge 2r^2(n)\left(\sin\frac{\pi}{3}\right)\left(\frac{r'(n)}{2r(n)}\right) = \frac{\sqrt{3}\,r(n)r'(n)}{2}$.

The probability that both $x$ and $y$ are isolated can thus be upper bounded as:


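$$\begin{aligned}
p_{21}(ii) &\le (\pi r^2(n) - \pi r'^2(n))(1-p)\left(1 - p\left(\pi r^2(n) + \frac{\sqrt{3}\,r(n)r'(n)}{2}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\left(\pi r^2(n) + \frac{\sqrt{3}\,r(n)r'(n)}{2}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\pi r^2(n)\left(1 + \frac{\sqrt{3}\,r'(n)}{2\pi r(n)}\right)\right)^{n-2} \\
&\le \pi r^2(n)\left(1 - p\pi r^2(n)\left(1 + \frac{8\sqrt{3}\log\log n}{\pi\log n}\right)\right)^{n-2} \\
&\le \pi r^2(n)\, e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right)} \quad \text{from Lemma 48 } (\because \pi < 2\sqrt{3}) \\
&\le \left(\frac{2\alpha\log^2 n}{n}\right) e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right)} \\
&= e^{-(n-2)p\pi r^2(n)\left(1 + \frac{4\log\log n}{\log n}\right) + \log 2\alpha + 2\log\log n - \log n} \\
&\le e^{-\log n - b(n) - 4\log\log n + \frac{2(\log n + b(n))\left(1 + \frac{4\log\log n}{\log n}\right)}{n} + \log 2\alpha + 2\log\log n - \log n} \\
&\le e^{-2\log n - b(n) - \log\log n} \quad \text{for large } n
\end{aligned} \qquad \text{(B.20)}$$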


For case 2, the probability that some node can communicate with either $x$ or $y$ is lower bounded by the probability that it lies in range of $x$ (which is $\pi r^2(n)$) and shares a channel with it (which is $p$), or it lies out of range of $x$ but within range of $y$ (the part of $y$'s neighborhood that lies outside $x$'s neighborhood in Fig. B.1 (2) has area at least $\frac{1}{2}\pi r^2(n)$), and shares a channel with it. Thus the contribution of this case to the probability that both $x$ and $y$ are isolated is upper bounded by:


P22  <  (47rr^(n)  —  7rr^(n))  ^1  —  p7rr^(n)  —  -p7rr^(n)^ 

<  (47rr^(n)  —  7rr^(n))  ^1  —  ^p7rr^(n)^ 

<  37rr^(n)e“  from  Lemma  48 

n  / 


n—2 


—  I  log  n—  |fe(n)+ 51l2S21±thl))  —log  n+log  6«+2  log  log  n 

<  for  large  n 


<  e  2 
9 


(B.21)

The contribution of case 3 to the probability that both $x$ and $y$ are isolated is upper bounded by:

$$\begin{aligned}
p_{23} &\le (1 - 4\pi r^2(n))(1 - 2p\pi r^2(n))^{n-2} \\
&\le (1 - 2p\pi r^2(n))^{n-2} \\
&\le e^{-2(n-2)p\pi r^2(n)} \quad \text{from Lemma 48} \\
&\le e^{-2\log n - 2b(n) + \frac{4(\log n + b(n))}{n}}
\end{aligned} \qquad \text{(B.22)}$$

Then, the probability $p_2$ that nodes $x$ and $y$ are both isolated is given by:

$$p_2 = p_{21}(i) + p_{21}(ii) + p_{22} + p_{23} \qquad \text{(B.23)}$$

Let us first consider the case where $b(n) = b$ is a constant.


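$$\begin{aligned}
\Pr[\text{disconnection}] &\ge \sum_{x} \Pr[x \text{ is the only isolated node}] \\
&\ge \sum_{x} \Pr[x \text{ isolated}] - \sum_{x,\, y \ne x} \Pr[x \text{ and } y \text{ both isolated}] \\
&= np_1 - n(n-1)p_2 \\
&\ge n(1 - p\pi r^2(n))^{n-1} - n(n-1)\left(p_{21}(i) + p_{21}(ii) + p_{22} + p_{23}\right)
\end{aligned}$$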

>  Oe-^  -  n(n  -  1) 

_|_g-21ogn-6-loglogn  logn-|b  ^-2 log n-2b+ 

$$\ge \theta e^{-b} - (1+\epsilon)e^{-2b}$$

for any $\theta < 1$, $\epsilon > 0$, and large $n$ (Lemma 48, Lemma 49)

(B.24)

Now consider the case where $b(n)$ is not constant, and $\limsup_{n\to\infty} b(n) = b$. Then, for any $\epsilon > 0$, $b(n) - b \le \epsilon$ for large $n$. Since the probability of disconnection monotonically decreases in $b(n)$, we can take the following bound:

$$\Pr[\text{disconnection}] \ge \theta e^{-(b+\epsilon)} - (1+\epsilon)e^{-2(b+\epsilon)} \quad (\text{for large enough } n) \qquad \text{(B.25)}$$

Thus, if $\limsup_{n\to\infty} b(n) < +\infty$, the network is asymptotically disconnected with some positive probability.

Appendix C

Complete Proof of Scheduling Result (Theorem 13)

Recall the notation introduced in Chapter 5. Also recall that the arrival process at any link is i.i.d. over all time-slots, and that $E[\lambda_l(t)\lambda_k(t)]$ is bounded, i.e., $E[\lambda_l(t)\lambda_k(t)] \le \eta$ for all $l \in \mathcal{L}, k \in \mathcal{L}$, where $\eta$ is a suitable constant (hence $E[(\lambda_l(t))^2]$ is also upper-bounded by $\eta$). As mentioned in Chapter 5, we adopt the following convention: at the beginning of each time-slot, the scheduling decisions are taken, and transmissions occur. Then new arrivals occur at the end of the slot.

Let the queue-length of the queue for link $l$ and channel $c$ at the start of time-slot $t$ be denoted by $q_l^c(t)$. Let the rate allocated to link $l$ in slot $t$ over channel $c$ be denoted by $x_l^c(t)$. Since we are considering single-interface nodes, at most one of the $x_l^c(t)$'s is non-zero for a link $l$. Furthermore, $x_l^c(t) = 0$ if link $l$ is not scheduled over channel $c$ in slot $t$, and $x_l^c(t) = r_l^c$ otherwise.

Recall that $r_l = \max_{c \in \mathcal{C}} r_l^c$. From the assumptions stated in Chapter 5, $r_l^c > 0$ for all $l \in \mathcal{L}, c \in \mathcal{C}$. Resultantly, $r_l > 0$ for all $l \in \mathcal{L}$.$^1$

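The queue dynamics are as follows:

$$q_l^c(t+1) = q_l^c(t) + \lambda_l^c(t) - x_l^c(t) \quad \text{where } \lambda_l^c(t) = \frac{\lambda_l(t)\, r_l^c}{\sum_{b \in \mathcal{C}} r_l^b} \qquad \text{(C.1)}$$

We define the following Lyapunov function: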


lec  cec 


Qtit) 


(C.2) 


This  Lyapunov  function  is  somewhat  similar  in  form  to  that  used  in  [120]. 

$^1$As also stated in Chapter 5, the results can be easily generalized to the case when $r_l^c = 0$ for some $l, c$. However, even in those scenarios, it is reasonable to assume that $r_l > 0$ for all $l \in \mathcal{L}$, since any feasible load-vector must have $\lambda_l = 0$ for any link $l$ with $r_l = 0$, and such links can be ignored/eliminated from consideration beforehand.
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It  can  be  seen  that; 


K,(^(t+i))-y,(t(t))  =  EE 

I&CcGC 


gf(^  +  1)  I  y^ 

'  I  \  A  n\^^r 


<*±11  ^  ^  5jfi±i) 


\k&A{i)dGC  feer(0 


-EE 


qfit)  I 


I  EE#+  E 


<fk{t) 


=  EE 


iGCcec 


rpC  I  Y'C 

leCcGC  I  *  \kGA{l)dGC  feel'(0 

^,d( 


(gf ft)  +  ifjt  +  1)  ~  g;°(^))  I  y^  y^  (gfc(ft  +  ^fcft  +  1)  ~  gfcft)) 

'  \k£A{l)d(^C 


+ 


y^  (gfcft)  +  gfcft  +  1)  ~  gfc(ft) 

,^C 


fcer(0 


-EE 

i^Ccec 


m  / 


W  j  y^  y^gfc(ft  ,  y^  gfcft) 

-•C  I  j,d 

'  \k£A{l)dGC  ^  fcGr(/) 


^  y^y^gfCft  /  y^  y^ gfcft)  ,  y^  gfc(ft 

iGCceC  I-  [  \kGAil)d£C  ^  fcGl'CO 


y^y^gfW  /  y^  y^  (gfc(^  +  1)  ~  gfc(ft)  ,  y^  (gfcft  +  1)  ~  gfc(ft) 
IGCcGC  '  \k£A{l)d&C  *  fcel'(0 


■EE 

l£Cc£C 

EE 

leCcec 


iqf{t  +  l)-qf{t)) 


I  y^  y^gfc(ft  I  y^  gfc(ft 

1  /  ^  ^  j^d  /  -j  ipC 

\kGA{l)dGC  ^  fcGl'(0 


(gf(t  +  1)  -  gf(t))  /  y^  \  "  (gfc(t  +  1)  -  gfc(t))  y^  (gfc(t  +  1)  -  gfc(t)) 
'  \kGAil)dGC  fcel'(0 


=  EE 

iGCcec 


gfft) 


_y^y^gf(ft  [  y^  y^gfc(ft  ,  y^  gfc(ft 

/  ^  ^  J.C  I  /  ^  ^  j.d  /  ^  j,c 

i&Cc&c  *  \keA{i)d€C  ^  fcei'(0 
/  y^  \  "  (gfc(^  +  1)  ~  gfc(ft)  ,  y^  (gfc(^  +  1)  ~  gfc(ft) 

I  /  ^  ^  rpd  /  ^  j,C 


-EE 

ie£  ceC 


1: 


gfcft) 


+  EE 

ie£  cgC 

=  2EE 

ie£  cgC 

+  EE 

ie£  cgC 


\keA{i)dec  'k  fcei'(0 

(gf(t  +  l)-gf(t)) 

rpd  T'^ 

keA{i)deC  k  fcGi'(i) 

(gf(t  +  1)  -  gf  (t))  /  y^  \  "  (gfc(t  +  1)  -  gfc(t))  y^  (gfc(^  +  1)  ~  gfc(ft) 

I  /  ^  ^  j,d  /  J 


\keAil)d&C 


kei'H) 


m 


(  y^  y^  (gfc(^  +  1)  ~  gfc(ft)  ,  y^  (gfc(^  +  1)  ~  gfc(ft) 

I  /  ^  ^  j,d  /  ^  j,c 


^  \kGA{l)d€C 

(gf(t  +  l)-gf(t)) 


kei'(i) 


i  y^  \  "  (gfc(^  +  1)  ~  gfc(ft)  ,  y^  (gfc(^  +  1)  ~  gfc(ft) 

I  /  ^  ^  j.d  /  J 

\keA(l)d€C 


kei'(i) 


since $k \in A(l) \Rightarrow l \in A(k)$ and $k \in I'(l) \Rightarrow l \in I'(k)$ from the symmetric conflicts assumption

(C.3)

Denote by $\mathcal{L}'(t)$ the set of link-channel pairs $(l, c)$ for which $q_l^c(t) \ge r_l^c$. This set of link-channel pairs participates in the scheduling process for slot $t$. By design, the scheduler computes a maximal schedule over all participating links. Therefore, for all $(l, c) \in \mathcal{L}'(t)$:


k&Ail)d£C  k 


fcGi'd)  ^ 


>  1 


(C.4) 


If  A  lies  within  the  7}|c|  rate- region,  then,  by  assumption,  there 

exists  some  scheduling  algorithm  that  achieves  stability  with  load  vector  ^  ^  _ 

Similar  to  [74],  we  can  argue  that  this  implies  existence  of  an  average  service-rate  vector 
for  all  I,  c  satisfying  the  following  for  some  e  >  0; 


(1  +  e)^ 


CTs 


(C.5) 


cec 


Q 

EE4  <  K^c\  links  I  (C.6) 

k&l'{l)c&C 

EE3  <  max{l,7}  for  all  links  I  (C.7) 

k&A{l)d&C  ^k 

Set  xl  =  7}|C|)  ~  from  (C.5),  (C.6)  and  (C.7),  we  obtain  that: 


(1  -|-  e)A;  <  '^xf  for  all  links  I 
ceC 


(C.8) 


EE 

k&l'{l)c&C 


< 


_ K\c\o's _ 

(1  +  e)(%|  +max{l,7}|C|) 


for  all  links  I 


<  _ max{l,7}o-^ _ 

khl)^C^'  ~  (l  +  ^)(%l+“ax{l,7}|C|) 


for  all  links  I 


(C.9) 


(C.IO) 
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This  yields  that  for  all  links  I: 


< 


b&C  \k(iA{l)d&C  k  k&V{l)  k 
max{l,7}(Ts|C| 


X 


E  e!i  +  E  E 

K\C\(^s 


d  rrb 

ZA 

k&A{l)d&c''~k  k&(l)b&c''~k 


+ 


(C.ll) 


CTs 


(1  +  e)(K|c|  +  max{l,7}|C|)  (1  +  e)(K|c|  +  max{l,  7}|C|)  1  +  e 


<  (Js 


Since  rf  <  for  all  channels  c,  therefore  r?  >  fr^r^  >  cr^rf  for  all  c  G  C.  Therefore, 


for  all  links  1: 


b£C 


E= 

EEI^+  E 


E*t\ 


bec 


fcerm 


<  I  V  y^y^— +  y^  y^— 

1  _  /  _  _ T,(i  _  \ 

s-E  E  E;J+  E  ;f  <1 

bec  \k&A{l)d£C  k  k&I'(l)  k  j 

using  (C.ll)) 


bec 


bec 


7 


(C.12) 


When  A;  =  0  for  all  I,  the  queues  are  trivially  stable.  Hence,  let  us  only  consider  the  case 

Let  Qinit  =  max 


where  A;  >  0  for  at  least  one  link  I  ^  C.  Let  Umin  =  min 

i&c,  \i>o\ 


l&C  'I 


bec 


i.e.,  Qinit  is  the  maximum  of  the  initial  queue-lengths.  Note  that  if  A/  =  0  for  some  link  I, 

then  ^  <  Qinit- 

Using  (C.3): 


E[VqCq{t+l))  -  VqCq  {t))\~q  (t)] 

= 2i:  { Y. 

lec  ceC  \keA{i)deC  k 


E 

fcer(0 


9fc(i  +  1)  -  9fe(0i 


EE^ 

lec  cec 


iqf{t  +  l)-qf{t))  / 


EE 

\keA{i)deC 


qt{t  +  l)-qtit) 


E 

fcer(0 


(9fe(^  +  l)-<7fc(i)) 


^  2EE^  e  e^ 

leCceC  I-  \keA{i)deC 


EE^ 

lec  cGC 


+  E 

\ - 

H 

fcer(0 

L  •  k  J 

Kit) 

T? 

(ee^+ 

'  1 

\keA{i)dec  k 

E 


Kit) 
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=  2y  y® 

lec  csc  ^ 


/ 


I  jp  Afc(i) 

^  ^  Vr*-  ^  ^  Vr'' 

I  k£A{l)deC  Z_^' k  kGl'(l)  /  J  k 

\  L  bec  beC 


-El  1:1:^+  S  ^ 

keA{i)deC  k  kei'ii)  ^ 


-EE^ 

ie£  cGC 


/ 


<2y  y® 

ie£  cgC  ^ 

=  2yy® 

Z-^Z-^  j^c 

l^Cc^C  ^ 


( 


.bec  \ 


E/  +  y^  -^^(0 

kGAii)deC  E^fc  fcei'(/)E^fe  /  I 

'  beC  beC  /  J 


I  p,  Y^  Y^Afe(i)  Afc(i) 

I  keA{l)d£C  k  fcGl'Cn  /  / 

\  L  bec  bec  J 


-  S 


/ 


,  EEy+ 1: 

\k^A(l)dGC  2^'^k  fcGl'(; 


At 


bec 


Z^  Y^ 

fcFl'Ci')  /  / 
bec 


-E 


Y^^fc(0  _|_  y^  ^fc(0 

j^d  n-iC 

■eA{i)d€C  k  kei'H)  ^ 
y^  y^^fc(0  ,  y^ 

/  ^  rpd  /  ^  ^C 

keA(i)dGC  k  fcei'(i) 


n\ 


+  Cl 


y 


=  2  y 

(i.c)G£'(t)  ' 


/ 


,  i:i:y+ 

\keA{i)dec /^’’'k  kGi'd)  /  /, 


+  Ci 


i\ 


bec 


bee 


+  21:^ 

(Z,c)e(£xC)-£'(t)  ' 


/ 


i:i:y+  E 

keAii)deC  ^’’'k  kei'H 
beC 


Xk 


fcei'(/)E^' 

bee 


(  y^  y^^fc(0  ,  y^ 

I  I  Z  -V  Z  ^  jtd  /  -J  ipC 

I  \keA{i)dee  ^  kev(i)  ^ 

/  y^  s^^kW  ,  y^  ^fc(0 

I  I  Z  -V  Z  ipd  /  -J  j,c 

I  \keA{i)dee  ^  kev(i)  , 


\ 


J 


EEy^+  e 


<2  y 

-  ^  2^  j-c  \ 

(i,c)eC'{t)  ’■  \  keA(i)dee  /  / 

.  \  bee 

/ 


/ 


At 


\  / 


EEyC+  E 

keA{i)dee  2^'^k  kei'{i 


I  V 

.b  ^  V„b  I  I  ^  Vrf-  ^ 

fc  kev(i)/_^'k  j  \keA{i)dee/_^'k  kev(i 
bee  bee  J  \  bee 


Y.Y. 


E< 


E<\ 

bee 

kei'ii)  E^fe  / 
bee  / 


bee 


^  Vr^ 

kel'(l)/_^'k  I 
bee  bee  / 


-  E 


+2  E  ^ 

ie(z;xC)-£'(i)  ' 


I  Z  ^  Z  ipd  /  -J  jtC  I 

_  \keA{i)dee  ^  kev{i)  ^  J  ^ 

( 


<2  y  ?E) 

Z—^  yiC 

(i,c)G£'(t)  ' 


— e 


Xk 


/ 

EEy  +  E^ 

^keA{i)dee Z^'^'k  kevd)  /  Z. 


,  EEy  +  E^ 

\keAii)dee Z^'^'k  kevd)  /  Zi 

\ 


\ 


Xk 

,b 

I  \ 

bee  /  J 


+  Ci 


bee 


.b 

k  j 

bee  / 


+2  E  ^ 

ie{Cxe)-c'(t)  ‘ 


( 


using  (C.8),  (C.4)  and  (C.12) 


V 


EE 

keA(i)dee 

bee 


—  +  y 

y^„b  ^ 

/  /  k  kel'{l 


At 


^  V' 

kei'd)  /  / 


+  Ci 


K 

bee  / 


+  Ci 
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^2  E 


m 


{l,c)&C'(t)  ' 


/  /fc  kpi'd)  /  / 1 


kGAii)dec  z_^' k  feel 
>  bec 


bec 


yj 


-2  E 


9F(0 


eyr, 


l£{CxC)-C'(t)  ' 


+  2 


E 


9F(i) 


^Urain  “t”  ^  ^  ^ 


(i,c)e(£xC)-£'(t) 


/e(£xC)-£'(t) 


gC(^) 

(subtracting  and  adding  back  2  E  ^ 


9FW 

I  I 


/  Jk  feel'»~)  /  /; 


kGAii)deC  z_^' k  feel 
b£C 


Cl 


bec 


ie{CxC)-C'(t)  ' 


<  E!  ^  1  ^  i~^ymin)  +  ‘^f^Umin  E!  E^Q 


init  -i-2ey 

min 


IGC  cGC  ^ 


E 


ie£  ceC 
Ai=0 


(i,c)e(z;xC)-£'(t) 


9FW 

rf 


+  2 


E 


9F(i) 


(i,c)e(£xC)-£'(t) 


i:i:E+ 1:  E 

k&A(l)d^C  2^''^ k  fcel'fi'l  /  /fe 


+  Ci 


bec 


bec 


<-2e^EE9FW  +  ^3 


'  l^Cc^C 


where  rmax  =  max  rf,  Ci  =  \^\\Cvii^aa^\c\+i^a:.)  ^  C3  =  Ci  +  2eymin\mC\Qinit  + 
l£C,c£C  ^^1^2  ^ 

l^l|C|+2|£||C|(^  max  I C I  +  I  max')  ■ 

Invoking  Lemma  2  from  [85],  this  proves  stability. 




Appendix  D 

Auxiliary  Results  Used  in 
Broadcast  Proofs 

D.1  Justification  for  Approximate  Argument  used  in  Section  8.6

We claimed in Section 8.6 of Chapter 8 that, given a simple closed region $S$ of area $A$ and perimeter $p$, bounded by up to $k$ straight line segments and circular arcs of radius $r$, where $k$ is a small constant, the number of lattice points in $S$ is $A \pm O(p)$. We justify this by bounding $S$, within and without, by lattice polygons, and applying Pick's Theorem$^1$ [113]. For any such region $S$, consider the lattice polygon comprising grid squares that lie completely within $S$ (Fig. D.1). In certain cases, instead of a single lattice polygon, we obtain a number of simple polygons that may share a common vertex, or are disconnected (if $S$ has narrow constrictions or necks (Fig. D.2)). In rare instances, no such polygon may be obtained, if $S$ is extremely narrow, and has no grid square lying completely within it ($A = O(p)$ for such regions, and these can be ignored). We call the polygon(s) thus obtained $P_{in}$ (in case of multiple polygons, $P_{in}$ refers to their union). Note that $S - P_{in}$ comprises the grid squares that are partially in $S$, i.e., those traversed by the boundary of $S$. Since the boundary of $S$ comprises up to $k$ line segments and arcs of radius $r$, the number of grid squares traversed by the boundary is at most $2p + ck$, where $c$ is a constant. The area of $P_{in}$ must thus be at least $A - (2p + ck)$. Let $n_1$ denote the number of lattice points falling in $P_{in}$. Similarly, consider the lattice polygon $P_{out}$ obtained by taking the union of all grid squares that lie fully or partially in $S$. $P_{out}$ is simple, fully contains $S$, and its area can be no more than $A + (2p + ck)$ (it can at most have an additional area comprising the grid squares traversed by the boundary of $S$). Let the number of lattice points falling in $P_{out}$ be $n_2$. Then $n_1 \le N_i \le n_2$. By invoking Pick's Theorem it can be shown that $n_1 \ge A - O(p)$, and $n_2 \le A + O(p)$. Thus $N_i = A \pm O(p)$.

$^1$Pick's Theorem: Let $A$ be the area of a simple closed lattice polygon. Let $B$ denote the number of lattice points on the polygon boundary, and $I$ the number of points in the polygon interior. Then: $A = I + \frac{B}{2} - 1$.


Figure D.1: Bounding a Simple Closed Region via Lattice Polygons

Figure D.2: Region with Neck: Multiple Simple Polygons in Interior
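
The way Pick's Theorem ties lattice-point counts to area can be seen on small examples. The following sketch is illustrative (the function name is hypothetical): it computes the area of a simple lattice polygon with the shoelace formula, counts boundary points with a gcd per edge, and recovers the interior count from $A = I + \frac{B}{2} - 1$.

```python
from math import gcd


def pick_counts(vertices):
    """Return (area, boundary_points, interior_points) of a simple lattice polygon.

    `vertices` lists integer (x, y) corners in order around the polygon.
    """
    n = len(vertices)
    twice_area = 0
    boundary = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1                 # shoelace term
        boundary += gcd(abs(x2 - x1), abs(y2 - y1))     # lattice points on the edge
    twice_area = abs(twice_area)
    # Pick's Theorem: A = I + B/2 - 1, hence I = A - B/2 + 1
    interior = (twice_area - boundary + 2) // 2
    return twice_area / 2, boundary, interior


if __name__ == "__main__":
    area, b, i = pick_counts([(0, 0), (4, 0), (4, 3), (0, 3)])
    print(area, b, i, i + b)  # 12.0 14 6 20
```

The total number of lattice points is $I + B = A + \frac{B}{2} + 1$, and $B$ is at most the perimeter for a lattice polygon, which is the source of the $\pm O(p)$ term.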


D.2  Calculation  of  Collective  Area  of  Regions  $A$  and  $B_i$  from  Section  8.6

Consider Fig. D.3. Denote the regions within distance $r$ of nodes $N$ and $M$ by $nbd(N)$ and $nbd(M)$ respectively. Then the collective area of regions $A$ and $B_i$ = Area of $nbd(N) \cap nbd(M)$ $-$ Area of Sector $HMJ$ + Area of $\triangle HMJ$. We show the calculations below. All angles are in radians. Sector $KMR$ ($HMJ$) or $\triangle KMR$ ($HMJ$) refers to the sector/triangle subtending obtuse (and not reflex) angle $KMR$ ($HMJ$) at $M$.

1. Area of $nbd(N) \cap nbd(M)$ = 2 (Area of Sector $KMR$ $-$ Area of $\triangle KMR$).

Area of Sector $KMR$ = $\pi r^2\,\frac{\angle KMR}{2\pi} \approx r^2\cos^{-1}\left(\frac{1}{4}\right) \approx 1.318r^2$ for sufficiently large $r$.

Area of $\triangle KMR$ = $\frac{1}{2}r^2\sin(\angle KMR) \approx 0.242r^2$.

Thus, Area of $nbd(N) \cap nbd(M)$ = $2(1.318 - 0.242)r^2 = 2(1.076)r^2 = 2.152r^2$.

2. Area of $\triangle HMJ$ = $\frac{1}{2}r^2\sin(\angle HMJ) = \frac{1}{2}r^2\sin\left(2\cos^{-1}\left(\frac{1}{2}\right)\right) \approx 0.433r^2$.

3. Area of Sector $HMJ$ = $\pi r^2\,\frac{\angle HMJ}{2\pi} \approx 1.047r^2$.

Thus the collective area of $A$ and $B_i$ is given by:

$2.152r^2 - 1.047r^2 + 0.433r^2 = 1.538r^2 \approx 0.49\pi r^2$.
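
The arithmetic above can be reproduced numerically. The sketch below is a check under stated assumptions: the angles $\angle KMR = 2\cos^{-1}(1/4)$ and $\angle HMJ = 2\cos^{-1}(1/2)$ are inferred from the numeric values quoted in the text (1.318, 0.242, 0.433, 1.047), not read from Fig. D.3, and the function name is hypothetical.

```python
import math


def collective_area(r=1.0):
    """Collective area of the two regions, in the same units as r**2."""
    angle_kmr = 2 * math.acos(1 / 4)  # assumed; gives sector area ~1.318 r^2
    angle_hmj = 2 * math.acos(1 / 2)  # assumed; gives sector area ~1.047 r^2

    sector_kmr = 0.5 * r**2 * angle_kmr
    triangle_kmr = 0.5 * r**2 * math.sin(angle_kmr)
    lens = 2 * (sector_kmr - triangle_kmr)       # nbd(N) intersected with nbd(M)

    sector_hmj = 0.5 * r**2 * angle_hmj
    triangle_hmj = 0.5 * r**2 * math.sin(angle_hmj)

    return lens - sector_hmj + triangle_hmj


if __name__ == "__main__":
    area = collective_area()
    print(round(area, 3), round(area / math.pi, 2))  # 1.538 0.49
```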
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Appendix  E 

Useful  Mathematical  Results 


In  this  appendix,  we  state  some  results  that  have  been  used  in  many  of  our  proofs.  Many 
of  these  are  well-known  results. 

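Fact 1. For all $0 \le x < 1$:

$$\ln\frac{1}{1-x} \ge x$$

Fact 2. For all $0 \le x \le 1$:

$$(1 - x) \le e^{-x}$$

Lemma 50. (Jogdeo & Samuels [47]) Given $X = Y_1 + Y_2 + \ldots + Y_n$ where $\forall i$, $Y_i = Bernoulli(p_i)$, and $\sum p_i = np$, the median $m$ of the distribution is either $\lfloor np \rfloor$ or $\lceil np \rceil$, i.e., $\Pr[X \le m] \ge \frac{1}{2}$ and $\Pr[X \ge m] \ge \frac{1}{2}$.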

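Lemma 51. (Chernoff Bound [83]) Let $X_1, \ldots, X_n$ be independent Poisson trials, where $\Pr[X_i = 1] = p_i$. Let $X = \sum_{i=1}^{n} X_i$. Then, for any $\beta > 0$:

$$\Pr[X \ge (1+\beta)E[X]] \le \left(\frac{e^{\beta}}{(1+\beta)^{(1+\beta)}}\right)^{E[X]} \qquad \text{(E.1)}$$

Lemma 52. (Chernoff Upper Tail Bound [83]) Let $X_1, \ldots, X_n$ be independent Poisson trials, where $\Pr[X_i = 1] = p_i$. Let $X = \sum_{i=1}^{n} X_i$. Then, for $0 < \beta \le 1$:

$$\Pr[X \ge (1+\beta)E[X]] \le \exp\left(-\frac{\beta^2}{3}E[X]\right) \qquad \text{(E.2)}$$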

n 

Lemma  53.  (Chernoff  Lower  Tail  Bound  [83])  If  X  =  'ffXi,  where  each  Xi  is  independent 

i=l 

and  Bernoulli{pi) ,  then  for  <  [5  <1: 

Pr[X  <  (1  -  ^)E[X]]  <  exp(-^F[X])  (E.3) 
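To give a feel for how loose or tight these bounds are, the following sketch compares them against exact binomial tails. It assumes the standard textbook forms of the three bounds (the general bound raised to the power $E[X]$, $\exp(-\beta^2 E[X]/3)$ for the upper tail, and $\exp(-\beta^2 E[X]/2)$ for the lower tail); the parameters `n`, `p`, `beta` and the helper names are illustrative choices.

```python
import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(n, p, t):
    """Exact Pr[X >= t] for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, p, k) for k in range(math.ceil(t), n + 1))

def lower_tail(n, p, t):
    """Exact Pr[X <= t] for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, p, k) for k in range(0, math.floor(t) + 1))

n, p, beta = 200, 0.3, 0.5          # illustrative parameters
mu = n * p                          # E[X]

general_bound = (math.exp(beta) / (1 + beta) ** (1 + beta)) ** mu
upper_bound = math.exp(-beta**2 * mu / 3)
lower_bound = math.exp(-beta**2 * mu / 2)

exact_upper = upper_tail(n, p, (1 + beta) * mu)
exact_lower = lower_tail(n, p, (1 - beta) * mu)

print(f"upper tail: exact={exact_upper:.3e} general={general_bound:.3e} simplified={upper_bound:.3e}")
print(f"lower tail: exact={exact_lower:.3e} simplified={lower_bound:.3e}")
```

The simplified upper tail bound is weaker than the general one but is usually easier to manipulate in asymptotic arguments.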
Lemma 54. (Relative Entropy Form of Chernoff-Hoeffding Bound [45]) If $X = \sum_{i=1}^{n} X_i$,
where each $X_i$ is $Bernoulli(p)$, then for $p < \beta < 1$:

$$\Pr[X \ge \beta n] \le e^{-nD(\beta \| p)} \qquad \text{(E.4)}$$
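A minimal numerical sketch of the relative entropy form follows. It assumes the bound reads $\Pr[X \ge \beta n] \le e^{-nD(\beta\|p)}$ with $D(\beta\|p) = \beta \ln\frac{\beta}{p} + (1-\beta)\ln\frac{1-\beta}{1-p}$ (natural logarithm), and compares it with the exact tail and with the simpler $\exp(-\beta'^2 E[X]/3)$ form; all names and parameter values are illustrative.

```python
import math

def kl_bernoulli(b, p):
    """Relative entropy D(b || p) between Bernoulli(b) and Bernoulli(p), base e."""
    return b * math.log(b / p) + (1 - b) * math.log((1 - b) / (1 - p))

def exact_upper_tail(n, p, k):
    """Exact Pr[X >= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, p, k = 100, 0.2, 40      # illustrative: threshold k = beta * n with beta = 0.4
beta = k / n

entropy_bound = math.exp(-n * kl_bernoulli(beta, p))
exact = exact_upper_tail(n, p, k)

# Same event written as X >= (1 + b) E[X] with b = 1, using the simpler upper tail form.
simple_bound = math.exp(-(1.0 ** 2) * (n * p) / 3)

print(f"exact={exact:.3e} entropy_bound={entropy_bound:.3e} simple_bound={simple_bound:.3e}")
```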

Lemma 55. The Chernoff bounds continue to apply if the Poisson trials are not independent,
but are negatively correlated.

This  is  a  well-known,  and  often-used  result.  See  [87,  30].  Also  see  the  proof  for  the 
Chernoff  bound  in  [83],  from  which  it  can  be  seen  that  this  holds. 

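Lemma 56. [24] If $X_1, X_2, \ldots, X_n$ are drawn i.i.d. from alphabet $\mathcal{X}$ according to $Q(x)$,
then the probability of a sequence $\mathbf{x}$ is given by:

$$Q^n(\mathbf{x}) = e^{-n(H(P_{\mathbf{x}}) + D(P_{\mathbf{x}} \| Q))} \qquad \text{(E.5)}$$

where $P_{\mathbf{x}}$ is the type of $\mathbf{x}$, and $H$ and $D$ denote the entropy and relative entropy functions (here considered w.r.t.
base $e$).

Also, for any distributions $P$ and $Q$, the size of type class $T(P)$ satisfies:

$$\frac{1}{(n+1)^{|\mathcal{X}|}} e^{nH(P)} \le |T(P)| \le e^{nH(P)} \qquad \text{(E.6)}$$

and the probability of the type class $T(P)$ under $Q$ is governed by:

$$\frac{1}{(n+1)^{|\mathcal{X}|}} e^{-nD(P \| Q)} \le Q^n(T(P)) \le e^{-nD(P \| Q)} \qquad \text{(E.7)}$$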
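The three statements of Lemma 56 can be verified by brute force on a tiny example. The sketch below assumes the standard method-of-types forms of (E.5) to (E.7) with natural logarithms; the alphabet size, sequence length, type `counts`, and distribution `Q` are arbitrary illustrative values.

```python
import math
from itertools import product

def entropy(P):
    return -sum(p * math.log(p) for p in P if p > 0)

def relative_entropy(P, Q):
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

n = 6
counts = (3, 2, 1)                    # symbol counts defining the type P
P = [c / n for c in counts]
Q = [0.5, 0.3, 0.2]                   # source distribution
alphabet = range(len(counts))

# Enumerate the type class T(P): all length-n sequences with exactly these counts.
type_class = [s for s in product(alphabet, repeat=n)
              if tuple(s.count(a) for a in alphabet) == counts]
size = len(type_class)

# (E.5): probability of one sequence of type P.
seq_prob = math.prod(Q[a] for a in type_class[0])
seq_formula = math.exp(-n * (entropy(P) + relative_entropy(P, Q)))

# (E.6) and (E.7): polynomial factor (n + 1)^|alphabet|.
poly = (n + 1) ** len(counts)
size_upper = math.exp(n * entropy(P))
class_prob = size * seq_prob
prob_upper = math.exp(-n * relative_entropy(P, Q))

print(size, seq_prob, seq_formula)
print(size_upper / poly <= size <= size_upper)
print(prob_upper / poly <= class_prob <= prob_upper)
```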


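Lemma 57. (Vapnik-Chervonenkis Theorem) Let $S$ be a set with finite VC dimension
$VCdim(S)$. Let $\{X_i\}$ be i.i.d. random variables with distribution $P$. Then for $\epsilon, \delta > 0$:

$$\Pr\left( \sup_{D \in S} \left| \frac{1}{N} \sum_{i=1}^{N} I(X_i \in D) - P(D) \right| \le \epsilon \right) > 1 - \delta$$

whenever

$$N > \max\left( \frac{8\,VCdim(S)}{\epsilon} \log_2 \frac{16e}{\epsilon},\ \frac{4}{\epsilon} \log_2 \frac{2}{\delta} \right)$$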
Lemma 58. Suppose we are given a region of area $n$, with $n$ nodes located uniformly at
random. Consider all axis-parallel rectangles of area $a(n)$. If $a(n) = 100\alpha \log n$, $1 \le \alpha \le \frac{n}{100 \log n}$,
then each such rectangle has at least $100\alpha \ln n - 50 \log n$ nodes, with probability at
least $1 - \frac{50 \log n}{n}$.


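Proof. It is known that the set of axis-parallel rectangles in $\mathbb{R}^2$ has VC-dimension 4. We
consider the set $S$ of all axis-parallel rectangles of area $100\alpha \ln n$. Then considering the $n$
random variables $X_i$ denoting node positions, $\Pr[X_i \in D\ (D \in S)] = \frac{100\alpha \ln n}{n}$. Then, from
the VC-theorem (Lemma 57):

$$\Pr\left( \sup_{D \in S} \left| \frac{\text{No. of nodes in } D}{n} - \frac{100\alpha \ln n}{n} \right| \le \epsilon(n) \right) > 1 - \delta(n)$$

whenever

$$n > \max\left( \frac{32}{\epsilon(n)} \log_2 \frac{16e}{\epsilon(n)},\ \frac{4}{\epsilon(n)} \log_2 \frac{2}{\delta(n)} \right)$$

This is satisfied when $\epsilon(n) = \delta(n) = \frac{50 \ln n}{n}$. Thus, with probability at least $1 - \frac{50 \ln n}{n}$, the
population $Pop(D)$ of cell $D$ satisfies:

$$100\alpha \ln n - 50 \ln n \le Pop(D) \le 100\alpha \ln n + 50 \ln n \qquad \text{(E.8)}$$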


This  completes  the  proof.  □ 
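The step "this is satisfied when" can be checked directly. The sketch below evaluates the sample-size threshold of the VC theorem, assuming it has the form $\max\left(\frac{8\,VCdim}{\epsilon}\log_2\frac{16e}{\epsilon}, \frac{4}{\epsilon}\log_2\frac{2}{\delta}\right)$, and tests whether $n$ exceeds it for $\epsilon(n) = \delta(n) = 50 \ln n / n$. The function names and the sampled values of $n$ are illustrative.

```python
import math

def vc_sample_threshold(vc_dim, eps, delta):
    """Right-hand side of the sample-size condition in the VC theorem (assumed form)."""
    return max(8 * vc_dim / eps * math.log2(16 * math.e / eps),
               4 / eps * math.log2(2 / delta))

def condition_holds(n, vc_dim=4):
    """Check n > threshold for eps(n) = delta(n) = 50 ln(n) / n."""
    eps = delta = 50 * math.log(n) / n
    return n > vc_sample_threshold(vc_dim, eps, delta)

print([condition_holds(n) for n in (10**3, 10**4, 10**6)])
```

By contrast, a fixed small accuracy such as $\epsilon = \delta = 0.01$ needs far more than 1000 samples under the same formula, which is why the proof lets $\epsilon(n)$ shrink only as fast as $\ln n / n$.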

Fact 3. If we attempt to divide a $\sqrt{n} \times \sqrt{n}$ grid into disjoint neighborhoods in the $L_\infty$
metric (as in Fig. 9.1), then the number of such disjoint neighborhoods that can be obtained
is at least  for large $n$. Observing that $d = 4r^2 + 4r$ and $d \ge d_{\min}$ =

8,  the  number  of  such  disjoint  neighborhoods  obtainable  is  at  least  —  lo^r+i  — 

—  23  large  n,  whenever  r  is  such  that  d  =  o(n). 

Lemma 59. Suppose we are given a unit torus with $n$ nodes located uniformly at random,
and the region is sub-divided into axis-parallel square cells of area $a(n)$ each. If $a(n) = \frac{100\alpha(n) \log n}{n}$,
$1 \le \alpha(n) \le \frac{n}{100 \log n}$, then each cell has at least $(100\alpha(n) - 50) \log n$, and at most
$(100\alpha(n) + 50) \log n$ nodes, with probability at least $1 - \frac{50 \log n}{n}$.

Proof. It is known that the set of axis-parallel squares in $\mathbb{R}^2$ has VC-dimension 3. In our
construction, we have a set of axis-parallel square cells $S$ such that the cells all have area
$\frac{100\alpha(n) \log n}{n}$. Then considering the $n$ random variables $X_i$ denoting node positions,
$\Pr[X_i \in D\ (D \in S)] = \frac{100\alpha(n) \log n}{n}$. Then, from the VC-theorem (Lemma 57):

$$\Pr\left( \sup_{D \in S} \left| \frac{\text{No. of nodes in } D}{n} - \frac{100\alpha(n) \log n}{n} \right| \le \epsilon(n) \right) > 1 - \delta(n)$$

whenever

$$n > \max\left( \frac{24}{\epsilon(n)} \log_2 \frac{16e}{\epsilon(n)},\ \frac{4}{\epsilon(n)} \log_2 \frac{2}{\delta(n)} \right)$$

This is satisfied when $\epsilon(n) = \delta(n) = \frac{50 \log n}{n}$. Thus, with probability at least $1 - \frac{50 \log n}{n}$, the
population $Pop(D)$ of cell $D$ satisfies:

$$(100\alpha(n) - 50) \log n \le Pop(D) \le (100\alpha(n) + 50) \log n \qquad \text{(E.9)}$$


□ 
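A quick simulation illustrates the cell-population bounds of Lemma 59. This is only a sketch under stated assumptions: the unit torus is modelled as the unit square (wrap-around does not matter for an axis-aligned grid), $\log$ is taken to be the natural logarithm, and the grid size `m`, the seed, and the function name are choices made here so that the cells tile the square exactly.

```python
import math
import random

def cell_populations(n, m, seed=0):
    """Drop n uniform points on the unit square and count points in each of m*m cells."""
    rng = random.Random(seed)
    counts = [[0] * m for _ in range(m)]
    for _ in range(n):
        x, y = rng.random(), rng.random()
        counts[min(int(x * m), m - 1)][min(int(y * m), m - 1)] += 1
    return [c for row in counts for c in row]

n, m = 10_000, 3
# Choose alpha so that the cell area 1/m^2 equals 100 * alpha * log(n) / n.
alpha = n / (100 * m * m * math.log(n))
lower = (100 * alpha - 50) * math.log(n)
upper = (100 * alpha + 50) * math.log(n)

pops = cell_populations(n, m)
print(f"alpha={alpha:.3f} bounds=({lower:.0f}, {upper:.0f}) min={min(pops)} max={max(pops)}")
```

In runs like this the observed populations sit much closer to the mean $100\alpha \log n$ than the bounds require, which reflects the slack in the VC-based argument.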

Lemma 60. Suppose we are given a unit torus with $n$ points (or nodes) located uniformly
at random. Let us consider the set of all circles of radius $R$ and area $A(n) = \pi R^2$ on
the unit torus. If $A(n) = \frac{100\alpha(n) \log n}{n}$, $1 \le \alpha(n) \le \frac{n}{100 \log n}$, then each circle has at least
$(100\alpha(n) - 50) \log n$, and at most $(100\alpha(n) + 50) \log n$ of these points (or nodes), w.h.p.

Proof. The set of all circles of radius $R$ in $\mathbb{R}^2$ has VC-dimension 3 (e.g., see [43]). Thereafter,
by the same argument as in the proof of Lemma 59, the result follows. □

Lemma 61. If $n$ pairs of points $(P_i, Q_i)$ are chosen uniformly at random in a unit area
torus divided into square cells of area $a(n)$ =  , the resultant set of straight lines
formed by each pair $L_i = P_iQ_i$ satisfies the condition that each cell has $O(n\sqrt{a(n)})$ lines
passing through it w.h.p.

Proof. Given that the lines $L_i$ are i.i.d., the proof argument of Lemma 3 in [36] can be applied
to prove this result. □

Lemma 62. The number of subsets of size $k$ chosen from a set of $m$ elements is given by

$$\binom{m}{k} \le \left( \frac{me}{k} \right)^k.$$
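The bound is easy to confirm exhaustively for small parameters. This sketch assumes the lemma states $\binom{m}{k} \le (me/k)^k$ for $1 \le k \le m$; the range of `m` and the function name are arbitrary.

```python
import math

def subset_bound_holds(m, k):
    """Check C(m, k) <= (m * e / k)^k for 1 <= k <= m."""
    return math.comb(m, k) <= (m * math.e / k) ** k

all_hold = all(subset_bound_holds(m, k)
               for m in range(1, 60) for k in range(1, m + 1))
print(all_hold)
```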

Theorem 31. (Hall's Marriage Theorem [44], [9^]) Given a set $S$, let $T = \{T_1, T_2, \ldots, T_n\}$
be a finite system of subsets of $S$. Then $T$ possesses a system of distinct representatives
if and only if for each $k$ in $1, 2, \ldots, n$, any selection of $k$ of the sets $T_i$ will contain between
them at least $k$ elements of $S$. Alternatively stated: for all $A \subseteq T$, the following is true:
$|\cup A| \ge |A|$.
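The equivalence in Hall's theorem can be illustrated on small set systems. The sketch below checks Hall's condition by brute force over all sub-collections and, independently, searches for a system of distinct representatives with a standard augmenting-path matching routine. Both function names and the two example systems are illustrative choices, not taken from the text.

```python
from itertools import combinations

def hall_condition(sets):
    """True if every selection of k sets covers at least k elements."""
    for k in range(1, len(sets) + 1):
        for group in combinations(sets, k):
            if len(set().union(*group)) < k:
                return False
    return True

def find_sdr(sets):
    """Return a list of distinct representatives (one per set), or None if none exists."""
    owner = {}  # element -> index of the set it currently represents

    def try_assign(i, seen):
        for e in sets[i]:
            if e in seen:
                continue
            seen.add(e)
            # Take e if it is free, or if its current owner can be moved elsewhere.
            if e not in owner or try_assign(owner[e], seen):
                owner[e] = i
                return True
        return False

    for i in range(len(sets)):
        if not try_assign(i, set()):
            return None
    reps = [None] * len(sets)
    for e, i in owner.items():
        reps[i] = e
    return reps

good = [{1, 2}, {2, 3}, {1, 3}]   # satisfies Hall's condition
bad = [{1}, {2}, {1, 2}]          # three sets covering only two elements

print(hall_condition(good), find_sdr(good))
print(hall_condition(bad), find_sdr(bad))
```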
Theorem 32. (Integrality Theorem [22]) If the capacity function of a network flow graph
takes on only integral values (i.e., each edge has integer capacity), then the maximum flow $x$
produced by the Ford-Fulkerson method has the property that $|x|$ is integer-valued. Moreover,
for all vertices $u$ and $v$, the value of $x(u,v)$ is an integer.
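The reason behind the Integrality Theorem is visible in code: each augmentation pushes the bottleneck residual capacity, which is an integer whenever all capacities are. The sketch below uses the Edmonds-Karp variant (breadth-first augmenting paths), one instance of the Ford-Fulkerson method. It assumes the graph has no pair of antiparallel edges, and the example network and function name are illustrative.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp max flow; assumes integer capacities and no antiparallel edges."""
    nodes = set(capacity) | {v for u in capacity for v in capacity[u]}
    residual = {u: {} for u in nodes}
    for u in capacity:
        for v, c in capacity[u].items():
            residual[u][v] = c
            residual[v].setdefault(u, 0)   # reverse residual edge

    value = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break

        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]

        # The bottleneck is a minimum of integers, so it is an integer.
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck
        value += bottleneck

    flow = {(u, v): capacity[u][v] - residual[u][v]
            for u in capacity for v in capacity[u]}
    return value, flow

capacity = {
    "s": {"v1": 16, "v2": 13},
    "v1": {"v3": 12},
    "v2": {"v1": 4, "v4": 14},
    "v3": {"v2": 9, "t": 20},
    "v4": {"v3": 7, "t": 4},
}
value, flow = max_flow(capacity, "s", "t")
print(value, all(isinstance(f, int) for f in flow.values()))
```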